SUMMARY


This  final  report  outlines  and  discusses  the  results  from  the  Nanoscale  Microelectronic 
Circuit  Development  program  funded  by  the  Air  Force  Research  Laboratory  (AFRL),  Space 
Vehicles  Directorate,  Kirtland  Air  Force  Base,  from  Sept.  8,  2007  to  Dec.  31,  2010.  The  project 
included  two  major  elements:  a  technical/research  focus  and  a  domestic  microelectronics 
workforce  development  effort.  Ten  research  projects  were  conducted  for  the  technical  focus,  and 
24  U.S.  citizen  undergraduate  and  graduate  students  were  awarded  educational-research 
fellowships under the workforce program. Part 1 of this report outlines and discusses the
research details and results, and Part 2 describes in detail the results of the domestic
workforce development initiative. Student fellowship research reports are provided in Appendix
A, and two invention disclosures are noted in Appendix B.

The  research  was  conducted  by  the  Center  for  Design  of  Analog-Digital  Integrated 
Circuits  (CDADIC),  which  was  established  in  1989  as  part  of  the  National  Science  Foundation’s 
Industry-University Cooperative Research Center (I/UCRC) program. Research teams included
faculty and students at CDADIC’s three affiliated universities: Washington State
University (WSU), University of Washington (UW), and Oregon State University (OSU).

The  overall  objective  of  the  technical  thrust  of  this  project  was  to  develop  nanoscale 
microelectronic  circuits  in  five  major  areas  of  importance  to  the  Air  Force  Research  Laboratory. 
As  identified  by  AFRL,  the  five  major  research  task  areas  that  constituted  the  statement  of  work 
for this project were: 1) design techniques for high process variability; 2) low-voltage/low-power
circuit design; 3) reconfigurable mixed-signal circuits; 4) low-power radio-frequency (RF)
receive/transmit architectures and circuits; and 5) improved nanoscale device models and circuit
simulators. Faculty and students from CDADIC’s universities addressed all five research areas
through 10 research projects. Table 1 lists the five research task areas and the 10 CDADIC
projects that met task requirements in those areas, fulfilling the statement of work for
the technical component of this work.

Research  in  design  techniques  for  high-process  variability  (task  area  1)  led  to  a  number  of 
novel  designs  and  circuits.  A  time-to-digital  converter  (TDC)  was  designed  that  used 
oversampling  and  noise-shaping  to  achieve  better  than  an  80dB  dynamic  range.  A  clock  and  data 
recovery  (CDR)  circuit  was  designed  in  90nm  technology  that  had  excellent  characteristics:  high 
device/process  variation  tolerance,  extremely  wide  tuning  range,  very  low  jitter/low  power,  and 
fast  frequency  hopping.  These  CDR  circuits  are  particularly  important  for  high-speed  data 
communication systems. Additionally, a locked low-noise amplifier (LNA) and an analog-to-digital
converter (ADC) that accommodate large process variations were developed.

Table 1. CDADIC Research Conducted in Five AFRL Technical Task Areas

The  10  research  projects  conducted  by  CDADIC  during  this  performance  period  met 
requirements  in  all  five  of  the  AFRL  technical  task  areas.  Below  is  a  list  of  the  five  task  areas  and 
the  corresponding  CDADIC  projects  that  met  various  requirements  in  those  task  areas. 

Technical Task Area 1: Design Techniques for High-Process Variability

CDADIC Project 3: Low-Power All-Digital Chip-to-Chip Interface Circuits,
by Pavan Kumar Hanumolu (OSU)

CDADIC Project 4: Nanoscale Clock and Data Recovery Circuits, by George La Rue (WSU)

CDADIC Project 7: Reconfigurable Master/Slave Locked Low Noise Amplifiers, by Brian Otis (UW)

CDADIC Project 8: Configurable and Robust Data Converters, by Gabor Temes (OSU)

Technical Task Area 2: Low-Voltage/Low-Power Circuit Design

CDADIC Project 1: Ultra-Low Power, Parallel Serial Link Interfaces Using Resonant Clocking and
Digitally Calibrated Phase/Gain Process Compensation, by Patrick Chiang (OSU)

CDADIC Project 3: Low-Power All-Digital Chip-to-Chip Interface Circuits,
by Pavan Kumar Hanumolu (OSU)

CDADIC Project 6: Stochastic and Passive A/D Techniques of Submicron CMOS,
by Un-Ku Moon (OSU)

CDADIC Project 10: Wideband Low-Power Delta-Sigma Converter, by Gabor Temes (OSU)

Technical Task Area 3: Reconfigurable Mixed-Signal Circuits

CDADIC Project 3: Low-Power All-Digital Chip-to-Chip Interface Circuits,
by Pavan Kumar Hanumolu (OSU)

CDADIC Project 4: Nanoscale Clock and Data Recovery Circuits, by George La Rue (WSU)

CDADIC Project 7: Reconfigurable Master/Slave Locked Low Noise Amplifiers, by Brian Otis (UW)

CDADIC Project 8: Configurable and Robust Data Converters, by Gabor Temes (OSU)

Technical Task Area 4: Low-Power RF Receive/Transmit Architectures and Circuits

CDADIC Project 7: Reconfigurable Master/Slave Locked Low Noise Amplifiers, by Brian Otis (UW)

CDADIC Project 9: A Low-Power, Low Jitter Fractional-N Frequency Synthesizer with Wide-tuning
BAW-stabilized VCO

Technical Task Area 5: Improved Nanoscale Device Models and Circuit Simulators

CDADIC Project 2: Advanced Gate Models for Deep Submicron CMOS Circuit Simulation,
by R. Bruce Darling (UW)

CDADIC Project 5: Coupled Device and Circuit Simulation for Analyzing the Effect of Random
Dopant and Geometry Fluctuations in Analog/RF Integrated Circuits, by Karti Mayaram (OSU)


In the area of low-voltage/low-power circuit design (task area 2), CDADIC faculty made
a number of important contributions. For future chip multiprocessors, off-chip I/O bandwidth
and power efficiency will be critical. In one project, off-chip interconnects were designed that
lowered power consumption to < 1mW/Gbps, which is more than a two-fold improvement over
the norm. For on-chip applications, power consumption was reduced to < 0.05mW/Gbps, which
is a five-fold improvement over traditional on-chip single-ended inverter buffers. In another
project, low-power chip-to-chip links in deep submicron digital processes were developed that
resulted in all-digital links with power efficiencies better than 5mW/Gbps. A state-of-the-art,
high-speed analog-digital converter (ADC) that is mostly synthesizable, scalable, and robust
was designed. This ADC benefits low-voltage applications that require such a converter, as is
typical in mobile and distributed systems. Novel design techniques were also
developed for low-power, wideband delta-sigma ADCs. New architectural and transistor-circuit-level
innovations reduced power while maintaining fast operation and high resolution. The
resulting design techniques are important for data converters used in communication systems,
digital video, and RF devices.

Next  generation  military  and  commercial  wireless  systems  will  need  to  accommodate 
extreme  reconfigurability  (task  area  3).  Such  systems  must  have  the  ability  to  cope  with  large 
process  variations  and  design  uncertainty,  operate  across  multiple  wireless  standards,  and  operate 
in various interference scenarios. To meet this need, researchers developed a frequency-locked
low-noise amplifier (LNA) that accommodates process and design uncertainty and operates over a
wide range of LNA bias points. A clock and data recovery (CDR) circuit with a digitally controlled
synthesizer also resulted from this work; it provides very fast frequency hopping, an extremely
wide tuning range, higher frequency resolution, and low phase noise. Communication
systems that use this type of circuit are more versatile than those built on analog
voltage-controlled oscillator (VCO)-based synthesizers and are reconfigurable after deployment to
meet a wide variety of military applications. Another CDADIC project addressing reconfigurability led to the
development  of  a  multi-cell,  delta-sigma  analog-digital  converter  (ADC)  incorporating  digital 
calibration  and  digital  control  of  resolution  and  power  requirements.  The  resulting  digitally 
programmable  ADC  is  a  flexible  and  robust  device  that  is  well  suited  for  implementation  in 
nanoscale  technology  and  for  operation  in  hostile  environments  (temperature  and  radiation). 

Low-power  RF  receive/transmit  architectures  and  circuits  (task  area  4)  is  another  area 
that  is  increasingly  important  to  next  generation  military  and  commercial  wireless  systems. 
Through this CDADIC effort, low-power, narrowband transceiver architectures were developed
that provide a robust, wide tuning range, which is particularly relevant for low-power handheld
transceivers. New phase-locked loop (PLL) topologies with very low power consumption and
low jitter/phase noise were also developed. These new topologies will help assess future passive
component technologies as well as emerging frequency synthesizer architectures.

Center research that focused on improved nanoscale device models and circuit
simulators (task area 5) resulted in new device models and a novel simulator. The new device
models are more accurate for deep submicron CMOS technologies at 45nm and smaller
nodes, and they update current SPICE MOSFET simulation models with more
accurate physical modeling of gate leakage processes. The novel simulator will
accurately predict the effect of parameter variations on the performance of analog and RF
circuits in nanoscale technologies. Designs developed with this simulator are expected to
achieve high scalability, high yield, and robustness to process variation.

The  above  designs  were  realized  as  microelectronic  circuits  using  a  variety  of  fabrication 
process technologies. The fabrications included processes in 65nm CMOS; 90nm, 1.2V CMOS;
0.13µm CMOS; and 0.18µm 2P4M CMOS. Complete details of the chips
that were fabricated and the circuit test results describing their measured performance are given in
each project report.

In  summary,  the  nanoscale  microelectronic  circuit  development  research  reported  here  as  it 
relates  to  the  five  AFRL  research  themes  resulted  in  improvements  and  innovations  in  the  current 
state-of-the-art design of circuits used for defense and military applications. However, to be
affordable, many of the mixed-signal technologies must be adapted from the technology being
developed for consumer products. The cost of developing an independent approach for defense
systems is prohibitive. Therefore, the optimum approach is to leverage the investments and
research and development activities of the consumer electronics industry. Thus, this research
effort built upon the strong connection between CDADIC’s industry members and its research
faculty, which added significant value to the final research accomplishments. It was
important that this research activity provide benefit for some commercial applications; this
was also accomplished and is described in this report. Examples include benefits to commercial
satellite systems and wireless applications.

The  second  thrust  of  this  grant  was  to  aid  in  developing  the  domestic  workforce  in 
microelectronics.  The  goal  of  the  workforce  effort  was  to  attract  more  U.S.  citizen  graduate  and 
undergraduate  students  to  universities  at  the  forefront  in  this  engineering  field.  The  purpose  was 
to  address  the  shortage  of  U.S.  graduate  students  in  this  discipline,  which  not  only  affects  the 
ability  of  universities  to  conduct  important  research,  but  also  hampers  the  ability  of  academia  to 
produce a highly educated, sustainable workforce to meet future research needs. The intent of the
fellowship program was to encourage undergraduates to pursue graduate study and careers in the
field of microelectronics, and to give graduate students research opportunities led
by leading faculty researchers along with real-world experience gained from center industry members. A
two-year follow-up of the 24 students who were awarded fellowships (10 graduate students
and 14 undergraduates) showed the program was highly successful and achieved its goal: all
students are now either continuing their education in graduate school or working as
electronics engineers in the commercial and defense sectors.

A final observation can be made as to the value of this research program to the sponsor.
The total value of the 10 research projects over a three-year period is more than the
technical results reported here, which are significant; it also includes the value to the
government and defense industry of the education and training of students who are
available to enter the workforce. In addition to the 24 students who worked on this research
through  AFRL  fellowships,  an  additional  nine  students  also  participated  in  this  research  activity, 
with  a  total  of  33  students  who  were  provided  educational  and  research  opportunities  through 
this  grant.  These  students,  along  with  their  faculty  advisors,  produced  15  journal  articles  and 
conference  proceedings,  as  well  as  theses  and  dissertations.  Two  invention  disclosures  were  also 
filed  in  connection  with  this  research  effort.  In  addition,  through  AFRL’s  CDADIC  membership, 
additional  technology  was  transferred  to  the  laboratory  at  no  additional  cost,  resulting  in 
additional dollar savings to AFRL. It can therefore be concluded that
supporting research activity important for defense and military applications through an existing
industry-university research consortium, such as CDADIC, is both cost-effective and
operationally efficient, and that this approach should be considered for future programs. The highly
successful domestic workforce program should also be considered for continuation.

1. TECHNICAL FOCUS: Research Projects


This final report describes the research resulting from the CDADIC/AFRL
program, Nanoscale Microelectronic Circuit Development, performed from Sept. 18, 2007 to
Dec. 31, 2010. The initial eight AFRL projects were conducted by CDADIC
faculty-student teams, with interactions with the center’s industry members. Complete details
and results of these projects are provided in this document, along with those of two additional
projects that were included in an addendum to the grant in fall 2008. A total of 10 research projects are
detailed in Part 1 of this report. Each report includes an abstract, project description and goals,
methodology, and research results and significant accomplishments achieved. A comparison of
actual accomplishments with the goals and objectives initially established for the time period is
given. Reasons for not meeting goals or objectives are described where applicable. Test chips
fabricated during the performance period are also reported, and information on procedures and
methods is provided. Other information for each report includes benefits to AFRL and the
commercial sector, technology transfer, intellectual property, publications and presentations, and
students working on the project who were AFRL Fellowship recipients.


1.1  PROJECT  1:  LOW-POWER,  OFF-DIE/ON-DIE  PARALLEL  INTERCONNECTS 

1.1.1  Abstract 

The goal of this project was to understand the tradeoffs between power consumption, data rate,
and process variation in the design of highly parallel serial links in deep submicron CMOS processes. For
off-chip applications, the milestone is ultra low-power (<1mW/Gbps), highly parallel (10-40 links),
high-speed (10Gbps) data rate. For on-chip applications, the milestone is ultra low-power (<0.1mW/Gbps),
highly parallel (200-400 links), high-speed (5Gbps) serial links. Each of these specifications is at least a
3-5x improvement over current aggressive, low-power serial link topologies.

This  research  project  explored  new  circuit  architectures  to  reduce  power  consumption  for  these 
highly  parallel  off-chip  and  on-chip  interconnects.  The  project  had  two  main  phases: 

1) Reduced power consumption for off-chip interconnect to < 1mW/Gbps, without using
inductors for clocking (due to process scaling issues and backward data rate compatibility).
This represents a greater than 2x improvement over recently published papers by Rambus and
Intel Circuits Research.

2) Reduced power consumption for on-chip interconnects to < 0.05mW/Gbps, which represents a
5x improvement over traditional on-chip single-ended, repeatered inverter buffering.

1.1.2  Project  Description 

This research describes a quad-channel, 6.4-8Gbps serial link receiver test chip using a global
forwarded clock distribution coupled to local injection-locked ring oscillators in 90nm CMOS. Each
receiver consists of a low-power linear equalizer, four offset-cancelled quantizers for 1:4 demultiplexing,
and an injection-locked ring oscillator for greater than one UI of phase deskew. Measured results show a
6.4-7.2Gbps data rate with BER < 10⁻¹⁵ across 10cm of FR4 backplane, and an 8.0Gbps data rate with direct
input. Designed in a 1.2V, 90nm CMOS process, each receiver occupies 0.0174mm², with a measured
power efficiency of 0.6mW/Gbps.

Power efficiency is a key metric for future wireline transceiver applications, where hundreds of
links  will  be  integrated  on  a  single  chip  [1].  Recent  state-of-the-art  receivers  have  shown  significant 
improvements  in  power  efficiency  by  focusing  on  reducing  dynamic  clock  power  using  resonantly-tuned 
LC  oscillators,  both  in  global  clock  distribution  [2,  4]  and  local  clock  demultiplexing  [3].  This  paper 
presents  further  improvements  in  dynamic  clock  power  consumption  by  implementing  a  low-voltage 
swing,  global  clock  distribution  to  multiple  link  locations,  where  locally-tapped,  injection-locked  ring 
oscillators  (ILRO)  are  used  to  generate  the  quadrature  sampling  clocks.  The  use  of  injection-locked  ring 
oscillators  enables  several  benefits,  including  CMOS  scalability,  large  range  of  data  rates,  and  low-power 
consumption. 
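
As a rough illustration of the clocking arithmetic behind a 1:4 demultiplexing receiver, the sketch below assumes four equally spaced (quadrature) sampling phases, so that the sampling clock runs at one quarter of the data rate and one unit interval (UI) spans 90° of the clock period. The class and function names are illustrative only and are not taken from the project's design files.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class LinkTiming:
    """Timing quantities for a 1:N demultiplexing receiver (illustrative names)."""
    clock_ghz: float        # frequency of each sampling clock phase
    ui_ps: float            # unit interval (one bit time)
    clock_period_ps: float  # period of the sampling clock
    degrees_per_ui: float   # clock phase spanned by one UI


def link_timing(data_rate_gbps: float, demux_ratio: int = 4) -> LinkTiming:
    """Derive clock and UI figures, assuming N equally spaced clock phases."""
    if data_rate_gbps <= 0 or demux_ratio <= 0:
        raise ValueError("data rate and demux ratio must be positive")
    clock_ghz = data_rate_gbps / demux_ratio
    return LinkTiming(
        clock_ghz=clock_ghz,
        ui_ps=1000.0 / data_rate_gbps,
        clock_period_ps=1000.0 / clock_ghz,
        degrees_per_ui=360.0 / demux_ratio,
    )


if __name__ == "__main__":
    for rate in (6.4, 7.2, 8.0):
        t = link_timing(rate)
        print(f"{rate} Gbps: clock {t.clock_ghz:.2f} GHz, UI {t.ui_ps:.1f} ps, "
              f"1 UI = {t.degrees_per_ui:.0f} deg of clock phase")
```

Under these assumptions, the 6.4-8Gbps data rates correspond to a 1.6-2GHz sampling clock, and a deskew range of more than 90° of clock phase covers more than one UI.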

1.1.3  Research  Results  and  Discussion 

The receiver block diagram (Fig. 1) shows the forwarded, 1.6-2GHz clock input, with a CML
clock buffer driving a 600µm-long, ground-shielded differential RC line to all four link receivers. The
200mVpp clock input is coupled to each receiver through a local CML buffer that injects into the
injection-locked ring oscillator. As shown in Fig. 2, each injection-locked oscillator consists of a
voltage-to-current converter and a four-stage, cross-coupled, differential current-starved ring oscillator, where all
eight delay cells share a single current source. This current source is 32b thermometer-encoded, and
its step is only 30µA so that it can serve as fine tuning of the free-running frequency of the oscillator. The
phase deskew is achieved by tuning this frequency under injection [5]. Phase symmetry is obtained by
using small cross-coupled inverters between complementary phases and a 3b binary capacitor bank on
each output clock phase to individually trim any phase imbalance due to process variations or layout
mismatch. If the 3b capacitor banks on all the phases are tuned together, they also serve as coarse control
of the free-running frequency. While simulation shows that the summing nodes CK135 and CK315 exhibit
significant phase offset due to current injection, the quadrature output phases CK0, CK90, CK180, and
CK270 show a measured phase asymmetry of less than 2ps. The measured injection-locking ranges for
injection strengths K=0.08, 0.16, 0.24, and 0.32 are 65, 115, 167, and 203MHz, respectively, where K is
defined as the injected current over the oscillator current [5]. Fig. 3 shows that the deskew range of the ILRO at
2.5GHz injection is greater than 1UI (>90° range for 1:4 demultiplexing), with deskew resolution as fine
as 1.8-3.6° due to the small step of the current source. The measured jitter transfer function of the ILRO is
shown in Fig. 4. It was measured by injecting a stressed clock with sinusoidal jitter of 5% UI amplitude. The ILRO
passes low-frequency noise while rejecting high-frequency noise from the input clock, like a first-order
PLL. Its -3dB bandwidths at K=0.08, 0.16, 0.24, and 0.32 are approximately 31, 55, 80, and 100MHz,
respectively.
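
The first-order-PLL-like behavior described above can be sketched with a single-pole low-pass model. This is only an assumed idealization of the measured jitter transfer, not the fitted response from Fig. 4; the dictionaries hold the measured values quoted in the text, and the function names are hypothetical. The sketch also converts a deskew step in degrees of clock phase into picoseconds.

```python
import math

# Measured values quoted in the text, keyed by injection strength K.
LOCK_RANGE_MHZ = {0.08: 65.0, 0.16: 115.0, 0.24: 167.0, 0.32: 203.0}
BANDWIDTH_MHZ = {0.08: 31.0, 0.16: 55.0, 0.24: 80.0, 0.32: 100.0}


def jitter_transfer_db(f_mhz: float, f3db_mhz: float) -> float:
    """Magnitude in dB of an assumed single-pole low-pass jitter transfer."""
    return -10.0 * math.log10(1.0 + (f_mhz / f3db_mhz) ** 2)


def lock_range_to_bandwidth() -> dict:
    """Ratio of measured locking range to measured -3dB bandwidth for each K."""
    return {k: LOCK_RANGE_MHZ[k] / BANDWIDTH_MHZ[k] for k in LOCK_RANGE_MHZ}


def deskew_step_ps(step_deg: float, clock_ghz: float) -> float:
    """Convert a phase step in degrees of the clock period into picoseconds."""
    return step_deg / 360.0 * 1000.0 / clock_ghz


if __name__ == "__main__":
    for k, ratio in lock_range_to_bandwidth().items():
        print(f"K={k}: locking range / bandwidth = {ratio:.2f}")
    print(f"Attenuation at 10x the bandwidth: {jitter_transfer_db(310.0, 31.0):.1f} dB")
    print(f"1.8 deg at 2.5 GHz = {deskew_step_ps(1.8, 2.5):.2f} ps")
```

With the quoted measurements, the locking range comes out at roughly twice the -3dB bandwidth for all four injection strengths, and a 1.8° step at a 2.5GHz clock corresponds to about 2ps.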

A  source-degenerated,  linear  equalizer  [2]  is  implemented  in  the  receiver  front-end  to  compensate 
for up to 8dB of channel losses at 4GHz. Each quantizer of the 1:4 demux is implemented using a two-stage
latch, including 6b offset cancellation by current-imbalancing. The down-multiplexed recovered
data  are  buffered  and  can  be  switched  and  selected  to  drive  open-drain  output  pads  for  testing. 

Test  Chips 

The 1mm² test chip (Fig. 7), implemented in a 90nm, 1.2V CMOS process, integrates four
receivers, the global clock distribution network, the digital scan chain, and a stand-alone injection-locked
ring oscillator for test. A BertScope 12500B is used for data pattern generation and BER testing, and an HP
8648D is used as the input clock source with RMS jitter=900fs and pk-pk jitter=6ps at 2.5GHz.

Fig. 5(a) plots the RMS jitter of the receiver clock at varying phase deskew positions, showing low-jitter
generation across the full 1UI of deskew range. Fig. 6 shows the eye diagram of the 1:4 demux 1.8Gbps
output and the measured bathtub curve for a 6.4-7.2Gbps 2⁷-1 PRBS data input versus the swept
injection-locked ring oscillator phase position settings, across 10cm of FR4 backplane (~5dB channel loss
@ 4GHz). An 8Gbps data rate was measured with no errors with a direct cable input (~0.8dB loss @
4GHz). Thanks to the low-swing global clock distribution and the local ILRO for phase deskew, the receiver
consumes only 3.84, 4.3, and 4.8mW at 6.4, 7.2, and 8Gbps, respectively. Fig. 5(b) presents the detailed
power breakdown. The measured results are summarized and compared in Tables 2 and 3.

Figure 1. Receiver block diagram

Figure 2. Schematic of ILRO

Figure 3. (a) Measured deskew range of ILRO, (b) deskew steps at different phase settings (x=10ps/div, y=25mV/div)

Figure 4. (a) Measured jitter transfer of ILRO, (b) one of the zoomed jitter measurements (x=4.8ps/div, y=5mV/div)

Figure 5. (a) Measured RX RMS jitter, (b) power breakdown of RX

Figure 6. (a) 1.8Gbps 1:4 recovered data output (x=111ps/div, y=156mV/div), (b) bathtub curve vs. phase position settings

Figure 7. Die photo

Table 2. Performance Summary

Supply voltage                        1.2V
ILRO tuning range                     1.6-2.6GHz
ILRO locking range (K=0.32)           203MHz
Phase deskew resolution (K=0.32)      1.8-3.6°
ILRO power                            0.88mW @1.6GHz     1.08mW @1.8GHz     1.3mW @2GHz
Total RX power (including amortized   3.84mW @6.4Gbps    4.3mW @7.2Gbps     4.8mW @8Gbps
power of global clock distr.)
Area                                  RX: 0.0174mm², including ILRO: 0.00[illegible]4mm²

Table 3. Comparison with Recent Designs

                    [?]              [?]              This work
Data rate           6.25Gbps         27Gbps           7.2Gbps
Architecture        Software CDR     Forwarded-CK     Forwarded-CK
CK distr. type      Global LC        Local LC         Local ring
RX power            8.22mW           43mW             4.3mW
Power efficiency    1.31mW/Gb/s      1.6mW/Gb/s       0.6mW/Gb/s
RX area             ~0.15mm²         0.015mm²         0.0174mm²
Technology          90nm CMOS        45nm CMOS        90nm CMOS
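
The power-efficiency figures in Tables 2 and 3 follow from dividing receiver power by data rate. The short sketch below recomputes them from the reported power and data-rate values. The helper name and the dictionary labels are illustrative, and small differences from the tabulated values reflect rounding in the source.

```python
def power_efficiency_mw_per_gbps(power_mw: float, data_rate_gbps: float) -> float:
    """Power efficiency in mW/Gbps, which is numerically equal to pJ/bit."""
    if data_rate_gbps <= 0:
        raise ValueError("data rate must be positive")
    return power_mw / data_rate_gbps


# (data rate in Gbps, receiver power in mW), as reported for this work.
THIS_WORK = [(6.4, 3.84), (7.2, 4.3), (8.0, 4.8)]

# Prior designs from Table 3, labeled here by their clock distribution type.
PRIOR_DESIGNS = {
    "global LC, software CDR": (6.25, 8.22),
    "local LC, forwarded clock": (27.0, 43.0),
}

if __name__ == "__main__":
    for rate, power in THIS_WORK:
        eff = power_efficiency_mw_per_gbps(power, rate)
        print(f"this work @ {rate} Gbps: {eff:.2f} mW/Gbps")
    reference = power_efficiency_mw_per_gbps(4.3, 7.2)
    for label, (rate, power) in PRIOR_DESIGNS.items():
        eff = power_efficiency_mw_per_gbps(power, rate)
        print(f"{label}: {eff:.2f} mW/Gbps ({eff / reference:.1f}x this work)")
```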

1.1.4  Other  Results 

1.1.4.1 Technology Transfer/Intellectual Property

Patent: Injection-Locked Ring Oscillator for Low-Power Phase Deskew
(Disclosure #OSU-09-02)

1.1.4.2 Publications and Presentations

K. Hu, T. Jiang, J.G. Wang, F. O’Mahony, and P. Chiang, "A 0.6mW/Gbps, 6.4-8.0Gbps Serial
Link Receiver Using Local, Injection-Locked Ring Oscillators in 90nm CMOS", in submission,
VLSI Circuits Symposium, June 2009.

G. Zhuo, P. Chiang, and W. Hu, "A 10Gbps Wire-line Transceiver with Half Rate Period
Calibration CDR", accepted to IEEE International Symposium on Circuits and Systems, Taipei,
Taiwan, May 2009.

Kang-Min Hu and P. Chiang, "Comparison of on-Die Global Clock Distribution Methods for
Parallel Serial Links", accepted to IEEE International Symposium on Circuits and Systems,
Taipei, Taiwan, May 2009.

1.1.4.3 Benefits to Commercial Sector

Power efficiency is a key metric for future bandwidth-constrained systems. For future chip
multiprocessors, off-chip I/O bandwidth and power efficiency are critical. State-of-the-art off-chip I/Os
achieve a power efficiency of 2-5mW/Gbps, while also requiring LC resonators to improve
power efficiency and reduce jitter. Our newest techniques enable 0.6mW/Gbps for the receiver while
using only conventional ring-oscillator-like structures, for process portability. Such low-power, high-speed
links will be useful for the massively parallel off-chip I/O chips required by Intel, AMD, IBM, LSI,
TI, and other member companies.

For on-chip interconnects, multicore processing is quickly becoming the dominant architecture
for higher performance moving forward. However, such multicore processing is also limited by
interconnect power, which is spent moving data between the different cores. Current single-ended,
inverter-buffered wires achieve a power of ~0.25mW/Gbps (for 1mm). Using low-swing techniques on die, both for
channels and for crossbars, we are building a low-power, on-chip router to obtain ~0.05mW/Gbps
per line. To prove this concept, a tapeout is currently underway for a 128b, 4-core network-on-a-chip
running at 1.5GHz. Such on-chip routers would be useful for companies such as AMD, IBM, and
Linear Signal, and for Semiconductor Research Corporation (SRC) and its affiliated companies, who are
interested in reducing the power consumption of multicore processing.
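
To give a sense of scale for these per-line figures, the following back-of-the-envelope sketch estimates the power of one 128b on-chip link. It assumes, purely for illustration, that each of the 128 wires carries one bit per 1.5GHz cycle (1.5Gbps per line) over roughly 1mm; neither assumption is stated in the report, and the function name is hypothetical.

```python
def link_power_mw(n_lines: int, gbps_per_line: float, mw_per_gbps: float) -> float:
    """Total link power, assuming every line runs at the same rate and efficiency."""
    return n_lines * gbps_per_line * mw_per_gbps


if __name__ == "__main__":
    n_lines = 128        # bus width from the text
    line_rate = 1.5      # Gbps per line, assumed equal to the 1.5GHz clock
    baseline = link_power_mw(n_lines, line_rate, 0.25)   # inverter-buffered wires
    low_swing = link_power_mw(n_lines, line_rate, 0.05)  # low-swing target
    print(f"baseline:  {baseline:.1f} mW per 128b link")
    print(f"low swing: {low_swing:.1f} mW per 128b link")
    print(f"reduction: {baseline / low_swing:.1f}x")
```

Under these assumptions a 128b link drops from about 48mW to about 9.6mW, the same five-fold ratio as the per-line figures.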

1.1.5  Conclusions 

This  research  project  explored  new  circuit  architectures  to  reduce  power  consumption  for  highly 
parallel  off-chip  and  on-chip  interconnects.  The  project  consisted  of  two  phases  that  have  been  completed 
with results available. The first phase reduced power consumption for off-chip interconnect to
<1mW/Gbps, without using inductors for clocking (due to process scaling issues and backward data rate
compatibility). This represents a greater than two-times improvement over the current norm. The measured
performance was 0.6mW/Gbps at a 7.2Gbps data rate. The second phase reduced power consumption for
on-chip interconnects to < 0.05mW/Gbps, which represents a five-times improvement over traditional
on-chip single-ended, repeatered inverter buffering. The design was for a 128b, on-chip per-core router for a
chip multiprocessor.

1.1.6  Students  Receiving  AFRL  Fellowships 

None 
1.2 PROJECT 2: ADVANCED GATE MODELS FOR DEEP SUBMICRON CMOS
CIRCUIT SIMULATION

1.2.1  Abstract 

Scaling  of  CMOS  is  now  dominated  by  gate  engineering  techniques  which  include  high-k  gate 
dielectrics (e.g., HfO2, HfN, HfSiOx) and new gate metal materials (e.g., TiN, TaN). These new materials
and gate stacks introduce effects which are not observed in conventional SiO2 gate MOSFETs, and which
are not accurately represented in the present generation of circuit simulation models. Gate leakage
currents are a primary concern with further device scaling, and high-k gate dielectric stacks introduce new
leakage  mechanisms  such  as  Fowler-Nordheim  tunneling,  Frenkel-Poole  emission,  and  threshold  voltage 
instabilities  which  have  been  related  to  interfacial  defect  states.  These  physical  mechanisms  cause 
significant  departure  from  the  conventional  models  of  gate  conduction  and  capacitance.  This  research 
addresses  the  need  for  accurate  gate  models  to  support  circuit  design  for  this  new  generation  of  CMOS 
gate  technology. 

This research work has developed a gate leakage current model from fundamental theoretical
calculations of the currents due to direct tunneling, Fowler-Nordheim tunneling, Shockley-Read-Hall
defect state recombination, and Frenkel-Poole defect state emission. This model has been
coded as a 4-terminal SPICE device model which can be added in parallel to an existing MOSFET to
more accurately represent the gate current components. The model has been used to simulate a selection
of digital, analog, and mixed-signal benchmark circuits in several of the more promising high-k
gate dielectric process technologies. This model can help to simulate the performance effects of porting
an existing circuit design in a conventional SiO2 gate process to a high-k gate dielectric process. It will
also provide a means for predicting the impact of a gate stack modification on downstream circuit
performance, and for selecting the most appropriate CMOS gate technology for a particular application.


1.2.2  Project  Description 

The  scaling  of  CMOS  transistors  over  the  past  ITRS  nodes  (500-350-250-180-130  nm)  has  been 
dominated  by  channel  engineering  techniques  which  have  worked  to  counteract  short  channel  effects  by 
means  of  shallower  and  more  abrupt  drain  and  source  junctions,  threshold  voltage  adjustment  implants, 
graded  and  retrograde  wells,  pocket  implants,  halo  implants,  source  and  drain  extensions,  and  graded 
channel implants, all while largely maintaining the fundamental gate stack of poly-Si/SiO2/Si [6]. As
MOSFET channel lengths have decreased, gate oxide thicknesses have also been proportionally reduced,
now to the point where leakage, tunneling, breakdown, and charge trapping have emerged as fundamental
limitations to SiO2, at nominal oxide thicknesses (Tox) of 5-7 nm and below. The current and future ITRS
nodes (90-65-45-33 nm) are now dominated by gate engineering techniques with the goal of providing
equivalently high oxide capacitances (Cox) with improved leakage, breakdown, and charge trapping
performance, beyond what can be achieved with the fundamental poly-Si/SiO2/Si gate stack.

Numerous high-k gate dielectric materials have been under investigation over the past decade, but today
several metal oxides are emerging as the primary candidates to both replace and augment SiO2 in the
present and future deep submicron CMOS processes. Foremost among these is hafnium oxide (HfO2)
with a relative dielectric constant of about 25, as compared to 3.9 for SiO2 [7]. This large increase in
dielectric constant allows a thicker layer of HfO2 to be used in place of SiO2 while creating an equivalent
value of Cox. The thicker dielectric layer then supports greatly reduced leakage currents and higher
breakdown voltages than the electrically equivalent thickness of SiO2, as a direct result of the lower
electric field in the oxide. While HfO2 gate dielectrics have been under intense research and development,
they still have several problems, including polaron effects stemming from their inherently high
polarizability, and charge-trapping-induced threshold voltage instabilities. These have been referred to in
the literature as positive-bias temperature instabilities (PBTI), mostly affecting nMOS devices, and
negative-bias temperature instabilities (NBTI), mostly affecting pMOS devices [7-10]. Both effects appear
to arise from oxygen-related defects at the HfO2/Si interface [11].

As a means for reducing these PBTI and NBTI effects, the more successful gate structures have
retained a thin layer of SiO2 next to the Si, and thus mixed-oxide gate dielectric stacks are becoming
increasingly popular, e.g., poly-Si/HfO2/SiO2/Si [12]. Oxy-nitrides, which were one early step towards
increasing the gate dielectric constant, have also been developed for the metal oxides, e.g., SiOxNy
has been transitioned to HfOxNy. Other high-k dielectric materials are also being explored, including
HfSiOx, HfN, HfLaO, HfON, HfTaON, and HfTaTiO, the last of which has a remarkably high dielectric
constant of 56 [13]. Besides a wide variety of new materials and a more complex stack-up of dielectric
layers, modern deep submicron CMOS gates are also trending back towards metal gate
materials and away from polysilicon, for reasons of both higher conductivity and better matching of
electron work functions. The most prevalent metal gate material under development is now TaN,
although there has been significant R&D on TiN as well. Currently, several of the major US (IBM, Intel)
and Taiwanese (TSMC) semiconductor manufacturers are finalizing development of 45 and 33 nm
processes which include HfO2 gate dielectrics with TaN gate metals.

As  process  technologies  progress  forward,  a  continuing  challenge  is  to  provide  accurate  device 
models  for  circuit  simulation.  Over  the  past  decade,  the  most  prevalent  device  models  have  been  the 
BSIM3,  BSIM4,  and  their  derivative  models  from  U.  C.  Berkeley  [14].  More  recently,  the  Compact 
Modeling  Council  (CMC)  has  chosen  the  Penn  State  -  Philips  surface  potential  model,  PSP  100,  as  the 
new  standard  for  deep  submicron  CMOS.  While  both  the  BSIM  and  PSP  models  have  done  reasonably 
well at tracking the technology thus far, both have been fundamentally tuned for the conventional
poly-Si/SiO2/Si gate stack: they are designed for dielectric constants fairly close to that of SiO2 (3.9), they do
not allow for multiple layer oxide stacks, and they do not include effects associated with interfacial
charge trapping (both fixed charge and dynamic “fast state” interface charge) at interfaces other than that
of SiO2/Si. Both the BSIM and PSP models use a single oxide thickness parameter (TOX) to compute the
oxide capacitance and channel gating function. While this can be adjusted somewhat to be used as an
“electrically equivalent” oxide thickness (sometimes denoted TOXE), it still fundamentally refers to a
single layer of SiO2 from which any gate leakage processes are derived. As high-k gate dielectrics, metal
gates,  and  multi-layer  gate  stacks  become  more  commonplace,  it  is  anticipated  that  the  BSIM  and  PSP 
models  will  have  to  undergo  a  major  renovation  to  retain  predictive  modeling  accuracy. 

One  of  the  most  important  areas  for  predictive  deep  submicron  device  modeling  is  gate  leakage 
current and gate capacitance effects. Gate leakage is now a well-documented limitation for high-density
digital, mixed-signal, analog, and RF CMOS [6]. Ultrathin gate oxides introduce a number of
leakage mechanisms, including direct tunneling of electrons from both the conduction and valence bands
through the oxide into the gate, direct tunneling of holes through the oxide into the substrate, and
injection of hot carriers from the channel or substrate into the gate oxide [15, 16]. These leakage currents
are manifested as current components in all four terminals of the MOSFET (S, G, D, B). In high-density
digital circuits, these currents constitute static leakage with a power dissipation that is proportional to
VDD. Dynamic CMOS power dissipation traditionally follows a VDD^2 dependence, so lowering the
power supply voltage has much less effect on reducing the static leakage power dissipation. For high
frequency circuits, the gate leakage current is higher still because of capacitive coupling, and this can
become large enough that the MOSFET starts to behave more like a low-β bipolar transistor. With
ultrathin gates, the classic assumption that the MOSFET has infinite input impedance completely breaks
down. Gate tunneling currents were added to the BSIM4 model as an improvement over BSIM3 which
did not include these effects, but these tunneling currents are still oriented toward only pure SiO2 gate
dielectrics.  The  gate  leakage  currents  of  even  comparatively  simple  oxy-nitrides  are  not  modeled  very 
accurately  by  BSIM4.  The  PSP  model  also  includes  terms  for  gate  tunneling,  but  these  are  not  computed 
internally  and  involve  coefficients  which  must  be  obtained  from  parametric  measurements.  One 
advantage  of  the  PSP  model,  however,  is  that  the  gate  tunneling  currents  are  computed  self-consistently 
with  the  surface  potential  of  the  channel,  making  the  division  of  the  gate  current  into  its  drain,  body,  and 
source  components  fairly  accurate. 
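
The different supply-voltage dependences noted above can be illustrated with a short sketch. It assumes, purely for illustration, that the leakage current itself is held fixed as VDD is lowered (so static power scales as VDD) and that dynamic power scales as VDD^2 with all other factors fixed. The function names and the 1.0 V to 0.8 V example are hypothetical and not taken from this project.

```python
def static_power_ratio(vdd_new, vdd_old):
    """Static leakage power ratio, assuming P_static = I_leak * VDD with I_leak held fixed."""
    return vdd_new / vdd_old


def dynamic_power_ratio(vdd_new, vdd_old):
    """Dynamic power ratio, assuming P_dyn = a * C * f * VDD^2 with a, C, f held fixed."""
    return (vdd_new / vdd_old) ** 2


if __name__ == "__main__":
    old, new = 1.0, 0.8  # hypothetical supply voltages, in volts
    print(f"static power falls by  {100 * (1 - static_power_ratio(new, old)):.0f}%")
    print(f"dynamic power falls by {100 * (1 - dynamic_power_ratio(new, old)):.0f}%")
```

Under these assumptions a 20% supply reduction lowers dynamic power by 36% but static leakage power by only 20%.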

We have developed an improved gate model for deep submicron CMOS processes that is
designed to handle a wide variety of high-k gate dielectrics, multiple-layer gate stacks, metal gates, and
the associated leakage current mechanisms that are now part of modern CMOS fabrication processes. We
have chosen not to completely rebuild the existing BSIM4 or PSP models; instead, our gate leakage
model is designed to be used as an augmentation for the existing BSIM4 or PSP MOSFET models. (It
will also work with earlier models, such as BSIM3.) The existing BSIM4 and PSP source codes are
complex,  and  our  model  is  inserted  simply  by  turning  off  the  existing  gate  conduction  model  within  these 
models  by  a  parameter  switch,  and  then  attaching  our  new  gate  model  in  parallel  with  the  (G,  S,  D,  B) 
terminals  of  the  existing  BSIM4  or  PSP  model,  which  allows  the  terminal  currents  to  simply  add  together. 
Most  SPICE  simulators  support  a  macro  or  sub-circuit  model  definition,  and  this  can  be  used  effectively 
to  combine  the  native  BSIM4  or  PSP  model  in  parallel  with  our  gate  conduction  model,  making  the 
overall  composite  model  transparent  to  the  circuit  designer. 
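
The parallel composition described above amounts to adding the terminal currents of two four-terminal elements. The following is a minimal sketch of that bookkeeping only; the class and function names are hypothetical, the numerical values are placeholders, and it does not represent the actual model code or any simulator interface.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class TerminalCurrents:
    """Currents flowing into the four MOSFET terminals, in amperes."""
    g: float
    s: float
    d: float
    b: float

    def __add__(self, other):
        # Parallel connection: currents at each shared terminal add.
        return TerminalCurrents(self.g + other.g, self.s + other.s,
                                self.d + other.d, self.b + other.b)

    def kcl_residual(self):
        # For a consistent device the four terminal currents sum to zero.
        return self.g + self.s + self.d + self.b


def gate_leakage(igcs, igcd, igb):
    """Add-on gate element: current entering the gate leaves through S, D, and B."""
    return TerminalCurrents(g=igcs + igcd + igb, s=-igcs, d=-igcd, b=-igb)


# Native MOSFET with its own gate conduction switched off (placeholder values).
channel = TerminalCurrents(g=0.0, s=-1e-4, d=1e-4, b=0.0)
composite = channel + gate_leakage(igcs=2e-9, igcd=1e-9, igb=0.5e-9)
```

Because each element individually satisfies Kirchhoff's current law, the composite does as well, which is why the two models can be combined without modifying the native model's equations.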

To embody a greater level of physical accuracy, the model starts from a first-principles
description of the electrostatics of a high-k gate dielectric stack and adds to that the semi-classical
Shockley-Read-Hall  (SRH)  dynamics  for  known  trap  states  within  these  layers,  and  for  any  interfacial 
states  between  adjacent  pairs  of  layers.  This  level  of  description  of  the  bulk  and  interface  defect  states 
leads  directly  to  computing  the  steady-state  generation/recombination  currents  as  well  as  the  dynamic 
charge-trapping  capacitances  of  the  overall  gate  dielectric  stack.  This  methodology  is  sufficiently  general 
that  it  allows  both  carrier  band  edge  trapping/de-trapping  processes  and  Frenkel-Poole  (FP)  emission  [11] 
to  be  handled  simultaneously  and  self-consistently. 
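
As a point of reference for the SRH dynamics mentioned above, the textbook steady-state SRH net recombination rate for a single trap level can be sketched as follows. This is the standard bulk expression, not the project's gate stack implementation; the lifetimes, carrier densities, and trap energy are caller-supplied and any example values are assumptions.

```python
import math

K_B_EV = 8.617333262e-5  # Boltzmann constant, eV/K


def srh_net_rate(n, p, n_i, tau_n, tau_p, et_minus_ei_ev=0.0, temp_k=300.0):
    """Steady-state SRH net recombination rate for a single trap level.

    U = (n*p - n_i^2) / (tau_p*(n + n1) + tau_n*(p + p1)),
    where n1 and p1 are the carrier densities that would exist if the
    Fermi level sat at the trap energy. Positive U is net recombination,
    negative U is net generation.
    """
    kt = K_B_EV * temp_k
    n1 = n_i * math.exp(et_minus_ei_ev / kt)
    p1 = n_i * math.exp(-et_minus_ei_ev / kt)
    return (n * p - n_i ** 2) / (tau_p * (n + n1) + tau_n * (p + p1))
```

The rate vanishes at equilibrium (n*p = n_i^2), becomes positive under carrier injection, and becomes negative (net generation) in a depleted region, which is the sign behavior a trap-assisted current model relies on.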

Tunneling  currents  are  computed  using  the  quantum-mechanical  Wentzel-Kramers-Brillouin 
(WKB)  approximation  for  the  three  principal  cases  of  (i)  electron  tunneling  from  the  conduction  band 
through  the  gate  oxide  layer  into  the  gate  (ECB),  (ii)  electron  tunneling  from  the  valence  band  through  the 
gate  oxide  layer  into  the  gate  (EVB),  and  (iii)  hole  tunneling  through  the  gate  oxide  into  the  valence  band 
of the substrate (HVB). Mechanism (i) is dominant for strongly inverted n-channel MOSFETs, while
mechanism (iii) is dominant for strongly inverted p-channel MOSFETs. A quantum mechanical
reflection/transmission correction term is introduced to treat the abrupt conduction and valence band
offsets which arise. The presence of multiple dielectric layers can lead to some special situations, such as
tunneling through one layer with drift/diffusion transport through the conduction band of the other, and
these are identified and treated separately. One important effect is the introduction of significant
Fowler-Nordheim (FN) tunneling through the gate dielectric, which is normally minimal in thin SiO2 gate
dielectrics, but significant in high-k gate dielectrics such as HfO2 [17,18]. Our computational approach
handles  FN  tunneling  in  precisely  the  same  manner  as  direct  tunneling,  thereby  integrating  these  two 
mechanisms  into  one  single  tunneling  integral. 
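
The idea of treating direct and FN tunneling with one integral can be sketched for the simplest case of a single dielectric layer with a linearly tilted (trapezoidal) barrier. This is a minimal illustration under stated assumptions, not the project code: it omits the reflection/transmission correction and multi-layer stacks, and the default barrier height (3.1 eV) and effective mass (0.5 m0) are illustrative values not taken from this report.

```python
import math

HBAR = 1.054571817e-34   # reduced Planck constant, J*s
M0 = 9.1093837015e-31    # free electron mass, kg
Q = 1.602176634e-19      # elementary charge, C


def wkb_transmission(barrier_ev, thickness_nm, v_ox, m_rel=0.5,
                     energy_ev=0.0, steps=4000):
    """WKB transmission T = exp(-2 * integral of kappa(x) dx).

    The barrier top falls linearly from barrier_ev at x = 0 by v_ox (volts)
    across the layer. Only the classically forbidden region (barrier above
    the carrier energy) contributes, so direct tunneling (forbidden across
    the full thickness) and Fowler-Nordheim tunneling (forbidden only up to
    the turning point) are handled by the same midpoint-rule integral.
    """
    t = thickness_nm * 1e-9
    dx = t / steps
    integral = 0.0
    for i in range(steps):
        x = (i + 0.5) * dx
        excess_ev = barrier_ev - v_ox * x / t - energy_ev
        if excess_ev > 0.0:
            kappa = math.sqrt(2.0 * m_rel * M0 * excess_ev * Q) / HBAR
            integral += kappa * dx
    return math.exp(-2.0 * integral)
```

With v_ox below the barrier height the carrier tunnels through the full layer thickness (direct tunneling); once v_ox exceeds it, the integration region ends at the turning point and the result depends on the field rather than on the layer thickness (FN tunneling).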

An overall self-consistent set of steady-state terminal currents (G, S, D, B) is finally computed
from the tunneling integrals of the gate dielectric stack, including the usual Igcs, Igcd, and Igb. The most
significant computational challenge is to properly divide the gate-channel tunneling current into its source,
body, and drain components. This requires knowledge of the surface potential profile of the MOSFET
channel. Our present model approximates this in a simplified manner, and this is one area in which it can
be improved in the future. Leakage currents through the gate-source and gate-drain overlap regions (Igso
and Igdo) are also included for consistency with existing BSIM4 and PSP model parameters. These current
components are illustrated in Fig. 8(a), and the direct tunneling current mechanisms in a conventional
SiO2 gate nMOSFET are shown in Fig. 8(b).

Figure 8. (a) Gate leakage current components between device terminals; (b) gate-dielectric-channel energy
band diagram (n+ poly-Si gate / SiO2 / p-Si substrate) with principal tunneling leakage current mechanisms shown


The introduction of a high-k gate dielectric material, on top of a thin SiO2 spacer layer, is
shown in Fig. 9(a,b) for p- and n-channel MOSFETs, along with the tunneling current mechanisms.
While the high-k gate dielectric increases the overall thickness of the gate dielectric stack and lowers the
average electric field within the stack, the lower conduction band edge of the high-k material, e.g., HfO2,
creates a smaller tunneling integral, the net effect of which does not necessarily reduce the leakage
current. For the predominant J_ECB current in Fig. 9(b), electrons only need to tunnel to the conduction
band edge of the HfO2 layer.

Figure 9. Energy band diagrams for nMOS and pMOS FETs with high-k gate dielectric stacks and principal
tunneling leakage current mechanisms shown


1.2.3  Research  Results  and  Discussion 

The research project tasks have been divided into two parts: (1) creating a semi-classical
computational model for the gate leakage currents in a high-k gate dielectric stack MOSFET, and (2)
applying this model to the simulation of various benchmark CMOS circuits across several different high-k
gate dielectric processes, as well as conventional SiO2 gate CMOS. The objective has been to create
predictions of the performance changes caused by porting a given benchmark circuit from one process to
another, and to understand the effects of high-k gate dielectrics on the performance of specific circuits.
The first task in creating the physics-based model has been to compute the direct tunneling currents for
the various high-k gate dielectric stacks as a function of applied gate bias. Simulation codes (in MATLAB
and Mathcad) have been developed to compute the band structure profile of the gate stack. One important
piece of this process has been gathering band structure parametric data for the high-k gate dielectric
materials, including bandgap, conduction and valence band offsets, accurate permittivities, and common
defect state energies. This data has been gathered for the most common high-k gate insulators, and with
this data, the Poisson-Boltzmann equation can be solved numerically to yield the equilibrium band
structure profile as a function of applied bias across the gate stack.
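
A much simpler calculation than the full Poisson-Boltzmann solution already shows how an applied voltage divides across a layered gate stack. The sketch below assumes charge-free dielectrics with a continuous displacement field (no trapped or interface charge), which is a simplification of what is described above; the function name is hypothetical, and the permittivities are the values of 3.9 and about 25 quoted earlier for SiO2 and HfO2.

```python
def layer_voltage_drops(v_stack, layers):
    """Divide a voltage across stacked dielectric layers.

    layers: list of (thickness_nm, relative_permittivity) tuples.
    Assumes the displacement field is the same in every layer, so each
    layer's share of the voltage is proportional to thickness/permittivity.
    Returns a list of (voltage_drop_V, field_V_per_nm), one per layer.
    """
    series = [t / k for t, k in layers]
    total = sum(series)
    drops = [v_stack * s / total for s in series]
    return [(v, v / t) for v, (t, _k) in zip(drops, layers)]


# 0.4 nm SiO2 spacer under 2.5 nm HfO2, with 1 V across the stack.
result = layer_voltage_drops(1.0, [(0.4, 3.9), (2.5, 25.0)])
```

For this stack roughly half of the voltage falls across the thin SiO2 layer, where the field is higher than in the HfO2 by the ratio of the permittivities (25/3.9, or about 6.4).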

Once  the  band  structure  profile  is  available  in  an  analytic  or  numerical  form,  the  direct  tunneling 
probabilities  are  then  computed  using  a  modified  version  of  the  WKB  approximation  which  takes  into 
account  the  conduction  and  valence  band  offsets  at  the  material  interfaces.  This  modification  is  necessary 
because  the  WKB  approximation  normally  applies  to  smoothly  varying  potentials,  whereas  the  band  edge 
offsets are atomically abrupt. The development of these computational codes is complete, and they form the
core of the new high-k gate dielectric MOSFET model.

Another  effect  which  must  be  included  in  computing  the  gate  leakage  currents  is  carrier  transport 
through trapping centers. High-k gate materials are known to have higher densities of trap centers
which can produce unwanted effects such as positive-bias temperature instabilities (PBTI) in nMOSFETs
and negative-bias temperature instabilities (NBTI) in pMOSFETs [7-13]. Trapping defects are normally
concentrated at the material interfaces, such as the oxygen vacancy and interstitial defects in HfO2/SiO2
interfaces which have been found to produce a number of threshold voltage instabilities and increased
gate  leakage  current.  The  transport  of  carriers  through  these  defect  centers  can  be  described  by  capture 
and  emission  time  constants,  along  with  defect  center  density,  energy,  and  type.  Shockley-Read-Hall 
(SRH)  capture  and  emission  rates  are  used  to  describe  carrier  transport  through  these  defect  centers,  and  a 
model  for  common  interfaces  (whose  defect  statistics  have  been  measured  in  the  literature)  has  then  been 
assembled  from  these  parameters.  Frenkel-Poole  (FP)  emission  from  these  defect  centers  is  another 
process  which  can  contribute  to  gate  leakage  current  under  certain  circumstances  [11].  This  is  also 
computed  from  the  reported  capture  and  emission  statistics  of  the  defect  center. 
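
The field dependence of Frenkel-Poole emission can be illustrated with the classical barrier-lowering expression. This is the standard textbook form, offered as a sketch rather than the project's implementation; the choice of permittivity (a high-frequency value is commonly used) and the example field are assumptions.

```python
import math

Q = 1.602176634e-19      # elementary charge, C
EPS0 = 8.8541878128e-12  # vacuum permittivity, F/m
K_B = 1.380649e-23       # Boltzmann constant, J/K


def fp_barrier_lowering_ev(field_v_per_m, eps_r):
    """Frenkel-Poole lowering of the trap barrier, in eV.

    delta_phi = sqrt(q * E / (pi * eps0 * eps_r)); the result is in volts,
    which is numerically the lowering in eV for a singly charged trap.
    """
    return math.sqrt(Q * field_v_per_m / (math.pi * EPS0 * eps_r))


def fp_emission_enhancement(field_v_per_m, eps_r, temp_k=300.0):
    """Factor by which thermal emission from the trap rises above its zero-field rate."""
    lowering_j = fp_barrier_lowering_ev(field_v_per_m, eps_r) * Q
    return math.exp(lowering_j / (K_B * temp_k))
```

For example, a field of 1 MV/cm (1e8 V/m) with an assumed relative permittivity of 25 lowers the barrier by about 0.15 eV, which raises the room-temperature emission rate by more than two orders of magnitude.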

To illustrate these gate leakage tunneling current computations, two benchmark cases are
considered. Both use process parameters in the range typical for a 45-nm process node, as defined by the
ITRS [14]. These are structured for two digital process variants, a high performance process that achieves
a higher on-state current at the expense of a higher gate leakage, and a low power process that minimizes
the gate leakage at the expense of on-state current. The parameters used for the benchmark simulations
are chosen to lie toward the middle of each of these variants. Specifically, the equivalent (SiO2) oxide
thickness (EOT) is taken to be 0.8 nm; the power supply is taken to be VDD = 0.8 V; threshold
voltages are adjusted to VTn = -VTp = 0.25 V; source and drain junction depths are set at xj = 35 nm;
and parasitic elements are set appropriately for dual poly gates, shallow trench isolation, and retrograde doped
wells. With these common parameters, an ultrathin SiO2 gate and a high-k gate are compared, both with
the same EOT of 0.8 nm. The ultrathin SiO2 gate stack involves only 0.8 nm of SiO2, while the high-k
gate stack consists of a 0.4 nm SiO2 spacer layer capped by 2.5 nm of HfO2 to produce the same EOT.
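
The equivalence of the two stacks can be checked with the usual series-capacitance definition of EOT. The sketch below uses the permittivities quoted earlier in this section (3.9 for SiO2 and about 25 for HfO2); the function names are hypothetical.

```python
EPS0 = 8.8541878128e-12  # vacuum permittivity, F/m
K_SIO2 = 3.9             # relative permittivity of SiO2


def eot_nm(layers):
    """Equivalent SiO2 thickness of a dielectric stack, in nm.

    layers: list of (thickness_nm, relative_permittivity) tuples.
    Each layer contributes its thickness scaled by K_SIO2 / k.
    """
    return sum(t * K_SIO2 / k for t, k in layers)


def cox_f_per_m2(layers):
    """Gate capacitance per unit area of the stack, in F/m^2."""
    return EPS0 * K_SIO2 / (eot_nm(layers) * 1e-9)


sio2_only = [(0.8, 3.9)]
high_k = [(0.4, 3.9), (2.5, 25.0)]
```

Here eot_nm(high_k) evaluates to 0.4 + 2.5 x 3.9/25 = 0.79 nm, matching the 0.8 nm SiO2 reference to within rounding, even though the physical thickness of the high-k stack is 2.9 nm.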

Figures 10 and 11 compare the computed gate leakage tunneling current density for these two
cases.

Figure 10. Gate tunneling leakage current density for ultrathin 0.8 nm SiO2 gate oxide

Figure 11. Gate tunneling leakage current density for HfO2/SiO2 high-k gate dielectric with 0.8 nm EOT


Both of the above cases are shown with the source, body, and drain grounded, so that the gate
current represents the sum of the leakage currents into each of these three paths. The point of
comparison should be where the gate voltage is equal to the supply voltage, VGS = VDD = 0.8 V. For
sub-100-nm processes, the usually cited acceptable threshold for gate leakage current is about 1 A/cm^2,
which for most minimum sized devices produces leakage currents in the 100 nA range. As can be seen,
the ultrathin SiO2 gate process far exceeds this limit, although from a practical perspective, any process at
this level would actually be using an oxynitride gate which will lower this value by at least two decades.
The HfO2/SiO2 high-k gate process, however, is just slightly below the gate leakage current limit,
exhibiting the approximately two decades of improvement that have been reported in the literature for this
system when compared against an oxynitride gate stack. While this easily illustrates the advantages
afforded by the high-k gate dielectric stack, these computations also reveal one of the limitations, which
can be seen when the gate voltage is taken higher, as shown in Fig. 12. Above VGS = 1.5 V, the gate
leakage tunneling current density increases sharply, because of the transition from direct to
Fowler-Nordheim tunneling as the lower conduction band edge of the HfO2 layer falls below the energy of the
electrons in the n-channel inversion layer.


Figure 12. Gate tunneling leakage current density for HfO2/SiO2 high-k gate dielectric with 0.8 nm EOT


Figure 12 also illustrates that temperature effects are carried through each of the computational
stages, and confirms the reported feature that the temperature dependence is most significant in the
subthreshold region. The double-curved shape of the above tunneling current in a high-k gate stack is
one feature which is not modeled properly by the existing gate conduction models in the BSIM-4.5.0 and
PSP-102.0 circuit simulation models [10-13]. The gate conduction model used in BSIM-4.5.0 and in
PSP-102.0 is based upon pure Fowler-Nordheim tunneling through a nominal SiO2 gate insulator. While
both of these implementations can be adjusted in their parameters to provide a reasonable match to the
gate leakage tunneling currents over a restricted range, neither can model the more complex behavior of
the tunneling current in a high-k gate stack over the wider bias and temperature range illustrated above.

The  most  important  component  of  the  gate  leakage  tunneling  current  is  Igc,  which  flows  from  the 
gate  to  the  channel  under  conditions  of  strong  inversion.  This  current  Igc  divides  into  three  paths, 
depending  upon  whether  the  current  exits  the  device  through  the  source,  body,  or  drain  terminal,  and  the 
division of Igc into these three components (Igcs, Igb, Igcd) is determined by the state of the channel, as
given by its surface potential profile, ψs(x). Two other gate leakage components are commonly
implemented  in  most  circuit  models,  and  these  are  the  gate  leakage  currents  associated  with  the  gate- 
source  and  gate-drain  overlap  regions,  Igso  and  Igdo.  While  these  are  in  principle  computed  in  the  same 
manner  as  the  primary  gate  tunneling  current  Igc,  these  two  depend  more  heavily  upon  the  specific  two- 
dimensional  geometry  of  the  gate  and  its  oxide  sidewalls,  and  these  two  current  components  were  not 
implemented  in  the  present  work. 

The BSIM-4.5.0 model implements three independent equations for the Igcs, Igb, and Igcd tunneling
components, and each can be tuned to the bias dependence through a set of parameters. The PSP-102.0
model uses the surface potential profile, previously computed as part of the routine for finding the drain
current, to establish the split of Igc into its three components of Igcs, Igb, and Igcd. This is much more
accurate and physically representative, and one of the stronger features of the PSP modeling strategy. As
such, the PSP-102.0 model is more receptive to augmentation by the above tunneling computations.

To  make  the  results  of  this  work  more  widely  usable,  the  model  for  the  gate  leakage  tunneling  currents 
was  composed  so  that  it  could  be  placed  in  parallel  with  any  existing  circuit  simulation  model  for  the 
MOSFET. For those models which already include a gate conduction model, such as BSIM-4.5.0 or
PSP-102.0, the gate conduction in these models is simply turned off by a parameter switch, and the new gate
leakage tunneling model is placed in parallel with the existing MOSFET model, so that the device
terminal currents simply add together.

The  model  is  implemented  in  two  stages,  by  first  running  the  tunneling  computations  over  the 
appropriate  range  of  gate  bias,  extracting  several  parameters  from  these  computations,  and  then  using 
these  parameters  in  the  more  compact  circuit  simulation  model  which  is  implemented  in  C/C++,  and 
which  can  be  linked  into  most  of  the  usual  SPICE  circuit  simulators.  The  tunneling  computations  would 
in  practice  be  implemented  in  the  pre-computation  phase  when  the  model  is  first  loaded.  The  model  was 
implemented  in  Tanner  EDA  T-Spice®  and  composite  macro  models  for  both  n-channel  and  p-channel 
MOSFETs  were  created  in  S-Edit®  which  emulate  the  existing  NMOS  and  PMOS  models  that  are  used  in 
all  SPICE  implementations.  Thus,  from  a  circuit  schematic  perspective,  the  device  symbol  and  SPICE 
model  parameters  still  look  like  they  normally  do,  but  the  gate  conduction  model  transparently  replaces 
whatever  gate  conduction  model  the  MOSFET  model  level  may  have  originally  specified.  An  example  of 
this  is  shown  in  Fig.  13  for  a  simple  CMOS  inverter. 
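
The two-stage structure (an expensive pre-computation followed by a compact evaluation) can be sketched as follows. The report does not state which parameters are extracted, so the two-parameter form ln(J) = a + b*V used here is purely an assumption for illustration, and the stand-in function expensive_tunneling_model is hypothetical; it merely takes the place of the full tunneling computation.

```python
import math


def expensive_tunneling_model(v_gate):
    """Hypothetical stand-in for the full tunneling computation (arbitrary units)."""
    return 1e-3 * math.exp(6.0 * v_gate)


def fit_log_linear(voltages, current_densities):
    """Stage 1: least-squares fit of ln(J) = a + b*V; returns (a, b)."""
    n = len(voltages)
    logs = [math.log(j) for j in current_densities]
    mean_v = sum(voltages) / n
    mean_l = sum(logs) / n
    sxx = sum((v - mean_v) ** 2 for v in voltages)
    sxy = sum((v - mean_v) * (l - mean_l) for v, l in zip(voltages, logs))
    b = sxy / sxx
    return mean_l - b * mean_v, b


def compact_current_density(v_gate, a, b):
    """Stage 2: cheap evaluation used inside the circuit simulation loop."""
    return math.exp(a + b * v_gate)


# Pre-computation phase: sweep the bias range once when the model is loaded.
bias_grid = [0.1 * i for i in range(1, 9)]  # 0.1 V to 0.8 V
a_fit, b_fit = fit_log_linear(bias_grid,
                              [expensive_tunneling_model(v) for v in bias_grid])
```

The design point is that the costly physics runs once per model load, while each simulator iteration only evaluates the compact expression. A single log-linear segment would not capture a curve with two distinct regimes, so a practical extraction would need a richer functional form or a piecewise fit.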


Figure  13.  Tanner  EDA  S-Edit®  view  of  CMOS  inverter  with  implemented  gate  conduction  model 


With  a  functional  compact  model  now  available  for  the  gate  leakage  tunneling  currents,  the 
tradeoffs  in  process  technology  and  their  impact  on  specific  circuits  can  be  readily  evaluated.  This  also 
includes the use of TaN or TiN metal gates as well as any practical high-k gate dielectric stack-up.
Because  this  project  was  only  funded  for  its  first  year,  the  scope  of  the  benchmark  circuits  and  candidate 
process  technologies  was  reduced  to  a  subset  of  those  originally  planned.  The  benchmark  circuits  that 
were evaluated include:

1. static CMOS inverter

2. static NAND and NOR gates

3. sample/hold gate

4. switched capacitor integrator

The simulated process technologies (both prototypes for a 45-nm process node) include:

1. conventional ultrathin poly-Si/SiO2/Si gate stack, EOT = 0.8 nm

2. high-k poly-Si/HfO2/SiO2/Si gate stack, EOT = 0.8 nm


As an illustration of one of these permutations, Fig. 14 shows the gate input current into the
CMOS inverter of Fig. 13 as a function of the input voltage, Vin.


Figure 14. T-Spice® simulation of a CMOS inverter for a high-k gate dielectric process
at a 45 nm process node


Although  this  particular  circuit  was,  for  simplicity,  parameterized  symmetrically  for  the  nMOS 
and  pMOS  transistors,  the  simulation  still  illustrates  several  features  of  gate  leakage  on  static  CMOS 
logic.  The  red  curve  is  the  current  flowing  into  the  gate  of  the  nMOS  transistor;  the  blue  curve  is  the 
current  flowing  into  the  gate  of  the  pMOS  transistor,  and  the  green  curve  is  the  sum  of  the  two,  which  is 
the  current  flowing  into  the  combined  input  to  the  CMOS  inverter.  The  most  important  feature  is  that  the 
leakage  current  is  minimum  when  the  input  and  output  are  at  VDD/2  =  400  mV,  and  that  the  leakage 
current,  and  also  the  static  power  dissipation  through  the  input,  is  maximum  when  the  input  is  either  at 
GND  or  VDD.  It  is  an  important  observation  that  current  flow  and  power  dissipation  on  the  input  (gate 
side)  of  the  inverter  is  exactly  opposite  to  dissipation  on  the  output  (drain-side)  of  the  inverter.  This  is 
also true for higher order static CMOS logic functions, such as NAND, NOR, XOR, and AOI.

Several  circuit  techniques  have  been  developed  to  deal  with  the  higher  leakage  currents  of 
ultrathin  gate  oxides,  including:  (1)  transistor  stacks  for  standby  leakage  control,  (2)  multiple  threshold 
voltage designs, (3) dynamic threshold voltage designs, (4) supply voltage scaling, and (5) specific
leakage current reduction techniques for cache memory [6]. The above modeling approach can provide a
mechanism to evaluate the effectiveness of these techniques with high-k gate dielectric stack process
technologies.

In  summary,  this  project  has  implemented  an  improved  gate  leakage  tunneling  model  that  can  be 
applied, in principle, to any gate dielectric stack, involving oxide, nitride, oxynitride, or high-k materials,
in  any  arbitrary  order  and  thicknesses.  The  resulting  tunneling  currents  have  corresponded  very  closely  to 
those  values  cited  in  the  literature  for  equivalent  gate  processes,  and  the  model  has  been  implemented  in  a 
compact  form  for  use  in  SPICE  circuit  simulators.  The  model  allows  a  circuit  designer  to  quickly 
evaluate  changes  in  process  technology  without  having  detailed  test  data  from  previously  fabricated  test 
structures.  In  other  words,  the  model  does  not  require  any  sensitive  fitting  parameters,  although  it  does 
require  accurate  process  data  to  describe  the  gate  stack.  This  modeling  strategy  eliminates  the  need  to  use 
effective  oxide  thicknesses,  and  instead  the  detailed  gate  stack  is  used  directly  in  the  tunneling  current 
computations. 

Test  Chips 

None  scheduled  for  this  project. 


1.2.4  Other  Results 

1.2.4.1  Technology  Transfer/Intellectual  Property 

Device  model  codes  are  used  in  Mathcad,  MATLAB,  and  C/C++ 

1.2.4.2  Resulting  Publications  and  Presentations 

None  yet. 

1.2.4.3  Benefits  to  Commercial  Sector 

The  resulting  gate  conduction  model  should  be  much  more  accurate  and  flexible  than  existing 
device  models  and  should  thus  find  broad  utility  for  design  simulation  of  deep  submicron  CMOS  circuits. 
A  principal  use  of  the  model  will  be  in  evaluating  the  performance  changes  that  would  occur  when  a 
circuit is ported from a conventional SiO2 gate deep-submicron CMOS process to a high-k gate dielectric
(and potentially metal gate) CMOS process. Because the model will be based upon a fairly detailed
physical description of the gate structure, it will also provide a tool for evaluating the effects of different
gate dielectric stack-ups, as well as the effects of various defect states within these layers, and their
downstream ramifications on circuit performance. The model could also be extended further to treat floating
gate  memory  devices,  such  as  flash  or  EEPROM. 

1.2.5  Conclusions 

This  research  led  to  the  development  of  a  gate  leakage  current  model  from  quantum-mechanical 
WKB  calculations  of  the  tunneling  currents,  including  direct  tunneling,  Fowler-Nordheim  tunneling,  and 
current  flow  through  mid-gap  defect  states.  The  gate-to-channel  tunneling  current  is  then  divided  into 
source,  body,  and  drain  components,  based  upon  the  surface  potential  profile  of  the  channel  during 
inversion. This gate model has been coded as a 4-terminal SPICE device model that can be added
in parallel to an existing MOSFET, making it usable with any of the existing MOSFET models, such as
BSIM-4.5.0 or PSP-102.0. The model has then been used to simulate a selection of digital, analog, and
mixed-signal benchmark circuits in several of the more promising high-k gate dielectric process
technologies.

1.2.6  Students  Receiving  AFRL  Fellowships 

Grayson  Dietrich,  undergraduate  (U.S.  citizen) 

1.3 PROJECT 3: LOW POWER ALL-DIGITAL CHIP-TO-CHIP INTERFACE
CIRCUITS

1.3.1  Abstract 

A time-to-digital converter (TDC) is an electronic circuit designed to quantize a time interval into
a digital code for subsequent signal processing. Time-to-digital converters have been used historically in
applications such as high energy physics [24], laser range finding [25], and other time-of-flight
measurements. Recently, TDCs have become popular replacements for the phase detector in digital
phase-locked loops (DPLL) [26]. We propose a novel TDC architecture that utilizes oversampling and
noise shaping to achieve better than 80 dB of dynamic range.

1.3.2  Project  Description 

Modern  CMOS  processes  are  being  scaled  to  45nm  and  smaller  to  incorporate  high  digital  gate 
count designs. This poses several challenges to analog-centric designs such as the phase-locked loop
(PLL). The high cost and large area of precision analog components (resistors and capacitors), along with
the high gate leakage of MOS transistors, are the catalysts behind PLL digital loop filter research. A high
resolution  time-to-digital  converter  facilitates  the  use  of  digital  loop  filters  and  is  the  focus  of  this 
research. 

1.3.3  Research  Results  and  Discussion 

Background  of  Existing  Time-to-Digital  Converter  Circuits 

Existing  time-to-digital  converter  (TDC)  circuits  are  generally  thought  to  be  similar  to  flash 
ADCs  [24-33].  Like  a  flash  ADC  which  uses  a  series  of  unit  voltage  or  current  reference  levels,  existing 
TDC architectures use a series of unit delay elements to generate time comparison points. Notable
exceptions  are  found  in  [34]  and  [35].  Flash  TDCs  have  monotonic  input  to  output  transfer  characteristics 
and  have  time  resolution  capabilities  in  the  tens  of  picoseconds  range.  A  common  method  used  to  create 
each  unit  delay  element  is  to  use  a  CMOS  inverter.  A  typical  inverter  propagation  delay  in  a  standard 
90nm  CMOS  process  is  approximately  9-10  picoseconds  and  is  expected  to  decrease  with  decreasing 
process geometry. When a TDC replaces a traditional linear phase-frequency detector (PFD) in a
phase-locked loop (PLL), the jitter performance of the PLL degrades to a level proportional to the TDC
time resolution.
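
To make the flash-style quantization described above concrete, the following minimal sketch models a delay-line TDC as a chain of identical unit delays that produces a thermometer code. The function name `flash_tdc`, the 10 ps unit delay, and the 32-stage chain length are illustrative assumptions and are not parameters of any circuit in this report.

```python
def flash_tdc(interval_ps, unit_delay_ps=10.0, n_stages=32):
    """Quantize a time interval with a chain of identical unit delays.

    Stage k (1-based) reports 1 if the interval is at least k unit delays
    long, which gives a thermometer code analogous to a flash ADC.
    Returns (thermometer_bits, binary_code).
    """
    thermometer = [1 if interval_ps >= k * unit_delay_ps else 0
                   for k in range(1, n_stages + 1)]
    return thermometer, sum(thermometer)


# A 95 ps interval measured with 10 ps unit delays (hypothetical values).
bits, code = flash_tdc(95.0)
```

In this simplified model the resolution is fixed by the unit delay, which is why the inverter propagation delay sets the floor on flash TDC time resolution.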

When considering the TDC for use in a PLL, the observation can be made that the high signal
bandwidths of the flash TDC are excessive. Because of PLL loop stability requirements, a typical PLL
has a loop bandwidth less than one-tenth of the reference frequency. In this case, oversampling and noise
shaping techniques such as delta-sigma modulation become attractive.

Unlike  flash  ADCs,  delta-sigma  ADCs  convert  input  signals  using  information  residue  from 
previous  conversions  and  the  present  input.  In  a  CMOS  process,  the  residue  from  previous  conversions  is 
typically stored as voltages across capacitors. The analog-to-digital conversion process can then be defined as
a voltage/current domain input to voltage domain residue to digital domain output sequence. The
delta-sigma based TDC counterpart adds a new dimension. Since the input is in the phase domain and it is
difficult to accurately store phase residue, the conversion process is defined as a phase domain input to
voltage domain residue to digital domain output sequence.

The  challenge  is  to  accurately  convert  phase  domain  signals  to  the  voltage  domain.  Examination 
of the PFD and charge pump combination of a typical analog PLL shows that it is a highly non-linear
circuit. Discouragingly, the most non-linear region occurs around the PLL lock point, when the
reference and feedback edges are aligned. Using a PFD and charge pump to convert the phase domain
signal would lead to large amounts of harmonic distortion and severely limit a TDC’s resolution. The
PFD exhibits improved linearity as the difference between reference and feedback edges increases.

Architecture




Figure 15. Block diagram of phase-reference delta-sigma time-to-digital converter (PR-ΔΣ TDC)

Figure 15 shows the block diagram for the proposed TDC. The modulator portion is classified as
a continuous-time, second-order, cascade of integrators in feedback (CIFB) structure and has a single-bit
quantizer. The phase detector block (PD) transforms the phase difference between $\Phi_{IN}$ and $\Phi_R$ into a
voltage. The relationship between the input phase and reference phase is chosen such that
$|\Phi_{IN} - \Phi_R| > \pi/4$ for $\Phi_{IN} < 0.5\,FS$. The full-scale range for this design is $\pi/2$. Linear
transconductors GM1 and GM2 convert the voltage representation of the phase difference to a current, and
the R2 resistors perform the interstage voltage to current transformation. Performing the 1/s Laplace domain
operation, the active integrator circuits integrate the current, which is proportional to the phase difference.
The digital-to-phase converter (DPC) block transforms the digital output code into the phase domain, which
can then be processed by the PD.

The values of GM1, R2, and GM3 are determined by first prototyping the modulator in the discrete-time
domain with a loop filter, $H(z) = (-2z+1)/(z-1)^2$. Then applying the impulse-invariant transformation
with information about the DPC pulse shape leads to the continuous-time domain loop filter, H(s).
Knowing H(s) and solving the linear equations, the values of GM1, R2, and GM3 can be determined. Using
dynamic range scaling techniques, the maximum output swings can be optimized for power supply
voltage and op-amp output compliance range.
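
As a quick sanity check on the discrete-time prototype, the sketch below evaluates the stated loop filter on the unit circle and compares the resulting noise transfer function with the textbook second-order shape (1 - z^-1)^2. The relation NTF(z) = 1/(1 - H(z)) is a common sign convention that is assumed here; the report does not state which convention was used, and the function names are illustrative only.

```python
import cmath
import math


def loop_filter(z):
    """Discrete-time loop filter quoted in the text: H(z) = (-2z + 1)/(z - 1)^2."""
    return (-2 * z + 1) / (z - 1) ** 2


def ntf_from_loop(z):
    """Noise transfer function under the assumed convention NTF = 1/(1 - H)."""
    return 1.0 / (1.0 - loop_filter(z))


def ntf_ideal(z):
    """Ideal second-order noise shaping, (1 - z^-1)^2."""
    return (1 - 1 / z) ** 2


# Sample the unit circle from just above DC up to half the sampling rate.
points = [cmath.exp(1j * math.pi * k / 64) for k in range(1, 65)]
max_err = max(abs(ntf_from_loop(z) - ntf_ideal(z)) for z in points)
```

Under this convention the two expressions agree to within floating-point rounding, which is consistent with the second-order noise shaping reported for the modulator.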

Careful  consideration  must  be  given  to  the  unity-gain  bandwidth  of  the  op-amps.  Simulations 
have  shown  that  the  first  op-amp  unity-gain  bandwidth  (UGB)  should  be  six  to  seven  times  the  input 
clock  frequency  to  minimize  in-band  harmonic  distortion.  The  UGB  requirement  of  the  second  op-amp  is 
not as high; therefore, its bandwidth may be reduced to save power. For this design, the
UGB of the second op-amp is one third that of the first op-amp.

Test  Chips 

None  scheduled  for  this  project. 

1.3.4  Other  Results 

1.3.4.1 Technology Transfer/Intellectual Property

None yet.

1.3.4.2 Publications and Presentations

None  yet. 

1.3.4.3 Benefits to Commercial Sector

The proposed digitally intensive calibration schemes will enable implementing precision analog
functions in processes that experience very large parameter variations. In particular, the proposed design
techniques will enable all-digital links with better than 5 mW/Gbps power efficiencies.

1.3.5  Conclusions 

The phase-reference delta-sigma time-to-digital converter demonstrated a technique that uses noise
shaping and operates the phase detector in its linear region, processing phase information in the phase
domain to dramatically improve the resolution of the time-to-digital circuit. The output spectrum for the
PR-ΔΣ TDC with a 0.5FS input is shown in Figure 16. The spectrum shows second-order noise shaping,
close correlation with the ideal noise transfer function, and limited harmonic distortion. Table 4 shows the
performance summary for the proposed TDC. When fabrication and testing are complete, the PR-ΔΣ TDC
is expected to exceed existing TDC performance by up to an order of magnitude.


Figure 16. PR-ΔΣ TDC output spectrum for 0.5FS input


Table 4. PR-ΔΣ TDC Performance Summary

Parameter          Value    Units
Clock Frequency    156.25   MHz
Signal Bandwidth   610      kHz
VDD                1.0      V
Power              1.5      mW
SNR                83       dB
LSB                140      fs
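
The Table 4 values can be cross-checked with a short back-of-the-envelope calculation. The sketch below rests on three assumptions that the report does not state explicitly: the modulator samples at the clock frequency, the pi/2 full-scale range corresponds to one quarter of the clock period, and the standard sine-wave relation ENOB = (SNR - 1.76)/6.02 applies. All variable names are illustrative.

```python
# Values taken from Table 4.
f_clk_hz = 156.25e6      # clock frequency
bandwidth_hz = 610e3     # signal bandwidth
snr_db = 83.0            # simulated SNR

# Oversampling ratio, assuming the modulator samples at the clock rate.
osr = f_clk_hz / (2 * bandwidth_hz)

# Effective number of bits from the standard sine-wave SNR relation.
enob = (snr_db - 1.76) / 6.02

# Full scale of pi/2 assumed to span one quarter of the clock period.
full_scale_s = (1 / f_clk_hz) / 4

# Equivalent time step if the full scale were divided into 2**enob levels.
step_s = full_scale_s / 2 ** enob
```

Under these assumptions the oversampling ratio comes out close to 128 and the equivalent time step is on the order of 140 fs, in line with the figure listed in the table.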

1.3.6 Students Receiving AFRL Fellowships

Brian  Young,  graduate  student  (U.S.  citizen) 

1.4  PROJECT  4:  NANOSCALE  CLOCK  AND  DATA  RECOVERY  CIRCUITS 

1.4.1  Abstract 

Many applications require clock and data recovery (CDR) circuits for high-speed data
communication. This project researches CDR circuits in nanometer CMOS technology that have high
tolerance to device and process variations, an extremely wide tuning range, very low jitter, and low power,
and that require only a small amount of layout area. The robustness to process variations, the small area,
and the ability to operate at low supply voltages are a result of minimizing the analog functions to
calibrated digital-controlled delays. A novel digital-controlled clock synthesizer (DCS) replaces the
voltage-controlled oscillator (VCO) in the phase-locked loop (PLL) of the CDR to enable instantaneous
frequency hopping and operation at all data rates below a maximum of 5 Gbps. The CDR is reconfigurable to
operate over many protocols and data rates. The CDR is designed with hardness-by-design techniques and
with sufficient margin to handle wide process variations and a large temperature range. Because no
analog filter is required in the PLL, die area is reduced substantially.

1.4.2  Project  Description 

Integrated  clock  and  data  recovery  circuits  are  needed  for  high-speed  data  communications  in 
many  applications.  There  is  an  increasing  demand  for  higher  and  higher  data  rates  and  support  for  many 
different  protocols.  The  communication  can  be  between  remote  locations,  between  ICs  on  the  same  or 
different  boards  in  a  system  or  even  between  different  modules  on  the  same  IC.  Nanoscale  CMOS 
electronics enable communication data rates well over 10 Gbps, but there are challenges in optimizing
performance and minimizing area and power dissipation. This project is to develop a robust, low-power,
low-jitter, small-area CDR circuit in nanoscale CMOS technology that can operate at all data rates below a
maximum of 5 Gbps in 90 nm CMOS. This flexibility will enable the CDR to communicate with
most of the existing protocols.

Designing  a  robust  CDR  using  nanoscale  CMOS  devices  with  small  die  area  is  challenging. 
Nanoscale  devices  have  increased  process  and  device  variability,  which  makes  design  of  analog  circuits 
more difficult. In addition, breakdown voltages are smaller, requiring lower supply voltages and decreasing
the signal-to-noise ratios of analog circuits. Die area is also an important issue. Obtaining low jitter with the
traditional approach of using a phase-locked loop (PLL) requires a narrow bandwidth, which implies large
capacitors and/or resistors. Capacitors and resistors do not scale with the technology, so their relative area
increases compared to digital circuits that do scale.

The  proposed  digital-controlled  clock  synthesizer  (DCS)  uses  digital  logic  to  set  the  period  of  the 
output  to  be  an  integer  number  of  reference  clocks  plus  an  interpolated  value  between  clock  transitions  by 
delaying  the  output  using  a  digital-to-delay  converter  (DDC)  [36].  This  is  similar  to  the  operation  of  a 
direct  digital  frequency  synthesizer  without  the  digital-to-analog  converter.  The  only  analog  component 
required  is  the  digital-to-delay  converter  which  will  have  trimmable  delay  elements  that  can  be  calibrated 
for  reduced  sensitivity  to  process  and  device  variations.  Other  advantages  of  this  approach  are  the 
immediate  frequency  hopping  ability  and  no  jitter  accumulation.  Die  area  is  also  reduced  substantially 
since  there  is  no  analog  filter  required.  Architectures  will  be  investigated  to  achieve  a  clock  frequency  of 
5  GHz  with  a  DDC  resolution  of  1  ps  in  IBM’s  9RF  90  nm  CMOS  process.  This  is  about  an  order  of 
magnitude  increase  in  frequency  over  previous  work  and  is  partially  due  to  a  novel  implementation  and 
partially  due  to  using  a  faster  90  nm  process. 

Figure 17 shows a block diagram of the proposed DCS. The DCS requires an input reference
clock with low jitter and a digital word representing the period T of the output clock divided by the
reference clock period Tclk. Writing T = 2(N + R)*Tclk, where N is a positive integer and R is the
fractional remainder (R < 1), the output clock can be generated by toggling the output after each delay of
N reference clock periods plus R*Tclk. For example, if Tclk is 200 ps (5 GHz) and T is 1064 ps (0.93985
GHz), then N = 2 and R = 0.66. The output is toggled at times 0, 532 ps, 1064 ps, 1596 ps, etc. An
accumulator is used to determine the fractional delays. The carry output of the fractional delay
accumulator signals an extra cycle of delay Tclk. A 10-bit value of N allows the clock period to range
from Tclk to 1024*Tclk, or the frequency from about 5 MHz to 5 GHz with Tclk = 200 ps. A 24-bit delay
accumulator provides a 1 Hz frequency resolution and an 8-bit vernier delay line provides sub-picosecond
resolution with Tclk = 200 ps. The output frequency can be changed almost instantly by changing the input
control word.
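
The toggling scheme can be illustrated with a short behavioral sketch. It is not the DCS hardware: exact rational arithmetic stands in for the 24-bit accumulator, and the function name `dcs_toggle_times` and its arguments are hypothetical. Each toggle adds N whole reference cycles plus the fraction R, and a carry out of the fractional accumulator inserts one extra reference cycle.

```python
from fractions import Fraction


def dcs_toggle_times(period_ps, tclk_ps, n_toggles):
    """Return (N, R, toggle_times_ps) for an output period T = 2(N + R) * Tclk.

    period_ps and tclk_ps are integers in picoseconds. Fractions keep the
    accumulator exact, standing in for a fixed-point hardware accumulator.
    """
    half_period = Fraction(period_ps, 2) / tclk_ps   # N + R, in units of Tclk
    n = int(half_period)                             # whole reference cycles
    r = half_period - n                              # fractional remainder, < 1

    times = []
    whole_cycles = 0          # reference clock cycles counted so far
    frac = Fraction(0)        # fractional delay accumulator
    for _ in range(n_toggles):
        times.append((whole_cycles + frac) * tclk_ps)
        whole_cycles += n
        frac += r
        if frac >= 1:         # carry out: add one extra reference cycle
            frac -= 1
            whole_cycles += 1
    return n, r, times


# Worked example from the text: Tclk = 200 ps, T = 1064 ps.
n, r, times = dcs_toggle_times(1064, 200, 4)
```

For the example in the text this gives N = 2, R = 0.66, and toggle times of 0, 532, 1064, and 1596 ps. In hardware the whole-cycle part would be handled by the counter and the fractional part by the digital-to-delay converter.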


Figure  17.  Block  diagram  of  digitally-controlled  clock  synthesizer 

The  DCS  needs  a  fixed  frequency  clock  reference  as  an  input.  Because  there  is  no  jitter 
accumulation  in  the  DCS,  the  jitter  can  be  nearly  the  same  as  the  clock  reference  assuming  that  the 
digital-to-delay  converter  has  low  jitter.  Fast  rise  and  fall  times  in  90  nm  technology  will  aid  in  achieving 
this  low-jitter  goal  for  the  DDC.  Since  the  clock  reference  is  at  a  fixed  frequency  and  does  not  need  to  be 
tunable, low jitter is easier to achieve on-chip. A fixed external reference with very low jitter can
also be used, and the proposed synthesizer can generate all other frequencies needed by the IC and
maintain the very low jitter.

The  DCS  will  be  used  to  implement  a  CDR  circuit.  The  phase  detector  will  drive  a  digital  filter  to 
control  the  clock  synthesizer  frequency,  avoiding  an  analog  loop  filter.  Digital  controlled  oscillators  have 
been  designed  using  a  time-to-digital  phase  detector  and  a  digital  filter  to  drive  a  digital-controlled 
oscillator  [37].  However,  the  digital-controlled  oscillator  is  typically  a  ring  oscillator  or  an  LC  oscillator. 
Ring oscillators have a wide tuning range and require small area but have higher phase noise. LC
oscillators have lower phase noise but have a narrow tuning range and require larger area. Both of these
approaches suffer from accumulated jitter.

The  result  of  this  research  will  be  a  CDR  circuit  that  is  reconfigurable  for  operation  at  any 
frequency  up  to  5  Gbps  with  very  low  jitter,  is  robust  to  radiation  effects,  temperature,  and  process 
variations,  scales  with  process  for  low  area,  and  has  low  power  dissipation. 

1.4.3  Research  Results  and  Discussion 

The design was initially intended to use the STMicroelectronics 65 nm process, but was switched to the
90 nm IBM process to reduce fabrication cost. Porting the designs took about 2 months since some
circuits  needed  modification  and  the  schematics  for  digital  logic  had  to  be  created  from  scratch. 
Simulations  show  that  a  5  GHz  clock  frequency  is  attainable  under  nominal  conditions.  The  vernier 
needed  to  be  modified  to  work  with  the  IBM  process  and  we  needed  to  extend  the  range  to  200  ps.  A  new 
counter  design  using  a  Johnson  counter  and  a  ripple  counter  was  implemented.  We  also  designed  the 
RAM to hold calibration values and solved some system timing issues. System-level simulation of the
DCS showed functionality. All of the blocks for the DCS are now designed and laid out. Layout of the
DCS is nearly complete.

Reverse  body  bias  (RBB)  was  used  on  the  substrate  to  provide  radiation  hardness  [38].  Last 
period  we  modified  LGEN  [39],  a  C++  layout  generator  developed  in  our  group,  to  provide  an  option  to 
generate  parameterized  CMOS  standard  cells  with  a  separate  reverse  body  bias.  We  entered  IBM’s  design 
rules  into  the  LGEN  database  and  then  were  able  to  quickly  generate  layouts  of  the  same  CMOS  logic 
gates  (nand,  nor,  inverters,  some  and-or-invert  gates  and  latches)  in  this  process. 




Figure  18.  Block  diagram  of  delay  accumulator  and  vernier  delays 

Figure 18 shows a simplified block diagram of the phase accumulator (PA) and delay lines. By
implementing the PA with only full adders with no carry propagation, the result consists of two
words, and the complete sum equals the output sum plus two times the carry output. If these two words
control two delay lines in series, the delay will be the same as if the complete sum was generated. This
approach greatly reduces the area and power dissipation. The upper 8 bits of the 24-bit delay accumulator
use triple modular redundancy (TMR) to reduce the single-event upset rate. Upsets in the lower bits have
only a minor effect on the accuracy of the DCS. Implementing TMR in the delay accumulator increases
logic, power and layout area by 50%.
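
The arithmetic identity behind the carry-free accumulator can be checked with a small behavioral sketch. It models only the bitwise full-adder logic, not the delay lines or the TMR, and it lets the value wrap modulo 2^24 rather than modeling the carry out of the top bit. The names `carry_save_step` and `check_accumulator` are hypothetical.

```python
WIDTH = 24
MASK = (1 << WIDTH) - 1


def carry_save_step(s, c, x):
    """One accumulate step using only per-bit full adders (no carry ripple).

    The accumulator value is represented by two words: value = s + 2*c.
    The stored carry word is shifted up one bit position and fed back in
    together with the sum word and the new input x.
    """
    c_shifted = (c << 1) & MASK
    s_next = s ^ c_shifted ^ x                              # full-adder sums
    c_next = (s & c_shifted) | (s & x) | (c_shifted & x)    # full-adder carries
    return s_next, c_next


def check_accumulator(increment, cycles):
    """Compare the two-word result with an ordinary accumulator, modulo 2**24."""
    s = c = 0
    reference = 0
    for _ in range(cycles):
        s, c = carry_save_step(s, c, increment)
        reference = (reference + increment) & MASK
        if (s + 2 * c) & MASK != reference:
            return False
    return True


ok = check_accumulator(0x2A3D71, 1000)
```

Because the sum word plus twice the carry word always equals the complete sum, two delay lines in series driven by these two words would produce the same total delay as a single line driven by the fully resolved sum.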

Each vernier delay line shown in Figure 18 was implemented with three delay lines. The delay values
are changed every clock cycle, and interleaving between two delay lines increases accuracy by allowing
changes in delay to settle for a clock cycle before being used in the next cycle. The third delay line enables
calibration to be performed without interrupting normal operation.

The  vernier  delay  lines  in  Figure  19  need  to  have  a  delay  range  of  200  ps  for  5  GHz  operation.  To 
minimize  jitter  in  the  DCS,  the  delay  of  the  delay  line  must  be  linear.  We  investigated  several  different 
implementations  of  the  vernier  delay  lines.  The  CML  delay  line  dissipated  much  more  power  so  we 
selected  the  CMOS  inverter  delay  line.  Switches  provide  different  load  capacitances  to  the  inverters  to 
control  the  delay.  Figure  19  shows  one  stage  of  this  delay  line.  The  delay  is  very  sensitive  to  supply 
variations  so  we  will  use  a  separate  regulated  supply  just  for  the  delay  lines. 

Calibration is performed by forming a ring oscillator with a delay line and a fixed delay of about
800 ns. The value of the 800 ns delay is not critical and is included to reduce the oscillator rate so a low
power counter can be used. The delay value is set to its minimum value and the number Nmin of rising
edges occurring during P clock cycles is measured. P needs to be several thousand cycles to provide
picosecond resolution and we use P = 2^13. The maximum delay should be 200 ps for a 5 GHz clock period
so a resolution of 200 ps/192 = 1.04 ps was chosen. To calibrate each delay value i from 1 to 191, the
number of rising edges occurring during P clock cycles should be

$$N(i) = \frac{N_{min} \cdot P \cdot 192}{2 \cdot i \cdot N_{min} + P \cdot 192}$$

Since the calibration occurs in the background, speed is not an issue. If P is 2^13, then multiplication by
192*P can be computed with a single addition and some shifts. Performing 1 bit per cycle with a GHz or
lower clock frequency enables the division to be implemented with a simple low-power ripple adder.
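
The calibration target can be tabulated with a few lines of code. The sketch below evaluates N(i) = Nmin*P*192 / (2*i*Nmin + P*192), cross-checks it against a direct ring-oscillator model (count = measurement window divided by twice the loop delay), and shows the shift-and-add form of multiplying by 192*P when P = 2^13. The value `N_MIN = 1024` is a hypothetical measurement chosen only for illustration, and the function names are not taken from the report.

```python
P = 2 ** 13          # reference clock cycles per measurement window
STEPS = 192          # delay steps across one 200 ps reference period
TCLK_PS = 200.0      # reference clock period in picoseconds


def target_count(i, n_min):
    """Target edge count for delay step i, from the calibration formula."""
    return n_min * P * STEPS / (2 * i * n_min + P * STEPS)


def count_from_delays(i, n_min):
    """Same quantity from a simple ring-oscillator model.

    The loop delay at the minimum setting is inferred from n_min, and each
    step adds Tclk/192. The oscillation period is twice the loop delay.
    """
    base_delay_ps = P * TCLK_PS / (2 * n_min)
    extra_delay_ps = i * TCLK_PS / STEPS
    return P * TCLK_PS / (2 * (base_delay_ps + extra_delay_ps))


def times_192_p(x):
    """Multiply by 192 * 2**13 = 2**20 + 2**19 with two shifts and one add."""
    return (x << 20) + (x << 19)


N_MIN = 1024  # hypothetical measured count at the minimum delay setting
targets = [target_count(i, N_MIN) for i in range(STEPS)]
```

The targets decrease monotonically with i because each added delay step slows the ring oscillator, so fewer rising edges fit within the measurement window.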


Figure 20. Block diagram of vernier delay line

The architecture of the vernier delay line is shown in Figure 20. The desired N(i) for i = 0 to 31 is
determined and a search is performed to find the 6-bit vernier input that is closest to it. This is stored in
the 32 x 6 RAM. The delay blocks with a nominal value of 32 ps are not adjustable. Instead, a separate
vernier delay is added in series. A 32 x 9 bit RAM provides control values that select how many 32 ps
delays are required, together with the buffer offset vernier delay setting, to produce the desired value
N(8*i) for i = 5 to 23. The top 5 MSBs of the 8-bit vernier input control the address to this RAM. Ideally,
this RAM would only need to be 8 x 9 bits if the addition of delays was perfectly linear over 32 steps.
However, this may not be the case, so every 8 steps are calibrated. The additional RAM required is small.
The total RAM required is 480 bits.
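
The search step and the RAM budget can be sketched as follows. The list of measured counts is invented for illustration, and `closest_code` is a hypothetical helper: the report does not describe how the search is implemented, so a simple exhaustive comparison is assumed.

```python
def closest_code(target, measured_counts):
    """Return the vernier code whose measured edge count is nearest the target.

    measured_counts[code] is the count observed with that code applied.
    An exhaustive search is assumed here purely for illustration.
    """
    return min(range(len(measured_counts)),
               key=lambda code: abs(measured_counts[code] - target))


# Hypothetical measurements for four vernier codes and one target count.
example_code = closest_code(1000.2, [1003, 1001, 1000, 998])

# RAM budget from the text: one 32 x 6 RAM plus one 32 x 9 RAM.
ram_bits = 32 * 6 + 32 * 9
```

The two RAM sizes quoted in the text add up to the stated 480-bit total.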

The  calibration  algorithms  were  developed  in  Verilog  using  mixed-mode  simulations  of  the  delay 
lines  and  calibration  RAM.  After  being  able  to  calibrate  a  single  delay  line,  we  expanded  the  Verilog  code 
to  control  calibration  of  all  three  delay  lines. 

Figure 21. Layout of the DCS


Test  Chips 

Figure  21  shows  the  layout  of  the  DCS  without  the  counters  and  digital  control  logic.  The  24-bit 
delay  accumulator  is  at  the  bottom  and  the  three  delay  lines  are  on  top.  The  size  is  about  0.8  mm  x  0.4 
mm.  Much  of  the  area  in  the  delay  lines  is  taken  up  with  memory. 

The  blocks  are  all  laid  out  but  not  connected.  Post-layout  simulations  have  been  performed  on 
most  of  the  blocks.  The  verified  blocks  will  then  be  combined  and  post-layout  simulations  on  the  entire 
DCS  will  be  performed.  During  fabrication,  we  can  prepare  all  the  test  fixtures  and  set  up  the  test 
equipment  in  order  to  complete  characterization  quickly.  We  will  use  RF  wafer  probes  to  provide  the 
reference  clock  and  the  high-speed  inputs  and  outputs. 

The original goals were to design, lay out, fabricate, and characterize a CDR using a DCS. The
design and most of the layout have been completed. Since Bill Hamon graduated, another student needs to
be assigned to this project and will need two or three months to finalize the layout and verify the
design through post-layout simulations. There are several reasons why the goals have not been met. The
original schedule was far too aggressive. The student was a Master’s student with no previous design or
layout experience, which was not reflected in the schedule. We started out using the STMicroelectronics
65 nm process and changed to the IBM 90 nm process to reduce fabrication costs, which cost us a couple of
months’ time. In an attempt to accelerate the project, I assigned a student to help develop and simulate the
background calibration algorithm. Two students helped Bill with layout.

This  project  will  result  in  a  clock  and  data  recovery  circuit  in  90  nm  technology  that  is  very 
tolerant  of  device  and  process  variations,  has  an  extremely  wide  tuning  range,  low  jitter,  low  power 
dissipation  and  only  requires  a  small  amount  of  layout  area.  The  robustness  to  process  variations  is  a 
result  of  minimizing  the  analog  functions  to  programmable  delays  that  can  be  trimmed  and  calibrated. 

The CDR’s wide tuning range along with low jitter and almost instantaneous frequency hopping provides
versatility  that  can  be  used  for  many  applications.  The  novel  delay  accumulator  design  will  result  in  low 
power.  A  90  nm  digital  standard  cell  library  using  reverse  body  bias  was  developed  to  provide  total  dose 
radiation  hardness  for  use  in  the  CDR. 

1.4.4  Other  Results 

1.4.4.1  Technology  Transfer/Intellectual  Property 

None  yet,  but  it  may  be  possible  to  obtain  a  patent  on  the  delay  accumulator  architecture. 

1.4.4.2 Publications and Presentations

Bill Hamon’s Master’s Thesis is available and contains more details about the DCS. We plan to
present simulation results at the Workshop for Microelectronics and Electron Devices in Boise in April
2009, and publish a journal article after measurements are completed. (If the measurement results show
significant benefits, efforts will be made to transfer this technology to interested companies.)

1.4.4.3 Benefits to Commercial Sector

This  project  falls  under  two  areas  of  the  AFRL  research  agenda:  Topic  1:  Design  techniques  for 
high-process  variability  and  Topic  3:  Reconfigurable  Mixed-Signal  Electronics.  Our  approach  is  mostly 
digital  in  nature  and  can  tolerate  large  process  variations  and  low  power  supply  voltages.  To  compensate 
for  device  variation  due  to  temperature  and  radiation,  calibration  will  be  performed  regularly  on  the 
digital-controlled  delays  without  disturbing  normal  operation.  Our  digital-controlled  synthesizer  provides 
very  fast  frequency  hopping,  extremely  wide  tuning  range  and  higher  frequency  resolution  along  with 
lower jitter than phase-locked loop (PLL) solutions. As a result, communication systems that use our
CDR are more versatile than those that use analog VCO-based synthesizers and are reconfigurable after
deployment to meet a wide variety of applications. A 90 nm radiation-hard standard cell library was created as part of
this  research. 

Digital  microprocessors,  FPGAs  and  ASICs  contain  many  PLLs  and  DLLs  to  handle  clock  skew 
and  high-speed  I/O.  The  commercial  sector  will  benefit  from  the  smaller  area  and  lower  jitter  from  the 
DCS  plus  the  fact  that  the  DCS  can  generate  all  the  clock  frequencies  needed  for  an  entire  chip.  DLLs  are 
also  mostly  digital  and  therefore  small  but  the  DCS  is  capable  of  synthesizing  frequencies  with  higher 
accuracy  and  lower  jitter.  Many  applications  in  commercial  communication  systems  can  also  benefit  from 
the  versatility  of  this  DCS  and  CDR  because  of  the  extremely  large  tuning  range,  low  jitter,  small  size  and 
fast  frequency  hopping  ability. 

1.4.5  Conclusions 

The  digital-controlled  synthesizer  (DCS)  provides  very  fast  frequency  hopping,  extremely  wide 
tuning  range  and  higher  frequency  resolution  along  with  lower  phase  noise  than  phase-locked  loop  (PLL) 
solutions. As a result, communication systems that use this CDR are more versatile than those that use
analog VCO-based synthesizers and are reconfigurable after deployment to meet a wide variety of
applications. In addition, a 90 nm radiation-hard standard cell library was created as part of this research.

1.4.6  Students  Receiving  AFRL  Fellowships 

Bill  Hamon,  graduate  student;  Dirk  Robinson,  graduate  student;  and  Dan  Hubert,  undergraduate 
student.  All  three  students  are  U.S.  citizens. 

1.5 PROJECT 5: COUPLED DEVICE AND CIRCUIT SIMULATION FOR
ANALYZING  THE  EFFECT  OF  RANDOM  DOPANT  AND  GEOMETRY 
FLUCTUATIONS  IN  ANALOG/RF  INTEGRATED  CIRCUITS 

1.5.1  Abstract 

A  coupled  device/circuit  simulator,  CODECS,  was  enhanced  to  include  coupled  device  and  circuit 
level  sensitivity  analysis  for  determining  the  impact  of  random  fluctuations  on  the  performance  of 
analog/RF  circuits.  Since  the  device  simulator  models  (numerical  models)  are  predictive,  they  can  be  used 
to  evaluate  the  impact  of  technology  on  circuit  performance.  Coupled  device/circuit  simulation  is 
essential  for  the  analysis  of  random  fluctuations  in  analog/RF  ICs  fabricated  in  nanoscale  process 
technologies. 

1.5.2  Project  Description 

Because  of  the  shrinking  size  of  devices  in  scaled  technologies,  random  fluctuations  in  the 
implanted  dopants  and  geometry  (e.g.,  oxide  thickness,  channel  length  and  width)  have  a  significant  effect 
on  the  device  characteristics.  These  in  turn  affect  the  performance  of  critical  mixed-signal  analog/RF 
circuitry.  The  accurate  matching  of  devices  is  an  important  consideration  for  analog  circuit  design,  and 
knowing  the  impact  of  random  fluctuations  is  essential  for  scaling  of  these  circuits  to  very  deep 
submicron  process  technologies  [40]. 

Although  the  study  of  random  fluctuations  in  semiconductor  devices  has  gained  significant 
attention  [41-43],  their  impact  on  circuit  performance  has  not  been  evaluated.  Traditional  approaches  for 
analyzing  these  effects  at  the  device  level  employ  computationally  expensive  Monte-Carlo  methods. 
Recently,  sensitivity-based  methods  have  been  applied  to  evaluate  the  effect  of  fluctuations  at  the  device 
level  [44-45].  These  methods  are  computationally  efficient  and  are  suitable  for  combining  device  and 
circuit  level  sensitivity  analyses  to  directly  compute  the  effect  of  random  dopant  and  geometry 
fluctuations  on  circuit  performance.  A  coupled  device/circuit  simulator  provides  a  direct  link  between  the 
IC  fabrication  technology,  device  design  and  the  higher  level  of  circuit  design  and  is  an  excellent  platform 
for  these  analyses. 

Since  the  models  from  the  device  simulator  (numerical  models)  are  predictive,  they  can  be  used 
to  evaluate  the  impact  of  technology  on  circuit  performance.  This  is  extremely  useful  when  experimental 
data  is  not  available  to  characterize  the  variability  in  a  new  process.  In  addition,  the  circuit  layout 
information  is  connected  to  the  simulator  which  is  necessary  for  analyzing  the  impact  of  mismatches  due 
to  layout  related  issues.  In  deep  submicron  processes,  the  layout  of  a  transistor  and  its  proximity  to  other 
devices  has  a  significant  impact  on  the  performance  of  the  transistor  under  consideration  [46]. 
Furthermore,  standard  mismatch  models  such  as  the  Pelgrom  model  [47]  are  not  physically  correct  for 
CMOS  technologies  that  use  halo  or  pocket  implants  [48].  This  implies  that  improved  physical  modeling 
of  variations  is  important  for  deep  submicron  processes. 

In  this  project,  the  capabilities  of  the  coupled  simulator  CODECS  [49]  were  enhanced  to  include 
coupled  device  and  circuit  level  sensitivity  analysis  for  determining  the  impact  of  random  fluctuations  on 
the  performance  of  analog/RF  circuits.  The  goal  was  to  develop  a  simulator  that  is  capable  of  predicting 
the  effects  of  process  fluctuations  on  the  performance  of  critical  analog/RF  circuitry.  Since  the  device 
simulator  models  (numerical  models)  are  predictive,  they  can  be  used  to  evaluate  the  impact  of 
technology  on  circuit  performance.  Coupled  device/circuit  simulation  is  essential  for  the  analysis  of 
random fluctuations in analog/RF ICs fabricated in nanoscale technologies. A soon-to-be-released
simulator, RandFlux [50], from Florida State University combines a comprehensive device simulation
capability for random fluctuations with circuit level analysis. That work is very
comprehensive and will supersede our work.

1.5.3 Research Results and Discussion

First we implemented dc (static) sensitivity calculations in CODECS based on [51]. We then
extended the calculation of sensitivities to doping profile for one-dimensional (1D) devices, i.e., diodes
and BJTs, for transient calculations. The general formulation of device level sensitivities can be described
in terms of Poisson's equation and the electron- and hole-current continuity equations.


where $\varepsilon$ is the dielectric constant of the material, $q$ is the electronic charge, $\psi$ is the electrostatic
potential, $n$ ($p$) is the electron (hole) concentration, $E$ is the electric field, $J_n$ ($J_p$) is the electron
(hole) current density, $\mu_n$ ($\mu_p$) is the electron (hole) mobility, $D_n$ ($D_p$) is the electron (hole)
diffusivity, $N_D$ ($N_A$) is the donor (acceptor) concentration, and $G$ and $R$ are net generation and
recombination rates, respectively.
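
For reference, a standard textbook statement of the drift-diffusion system that matches the symbols defined above is sketched here. The sign conventions and the grouping of terms are assumptions and may differ from the numbered equations used in CODECS [49].

```latex
\begin{aligned}
\nabla \cdot (\varepsilon E) &= q\,(p - n + N_D - N_A) \\
\frac{\partial n}{\partial t} &= \frac{1}{q}\,\nabla \cdot J_n + (G - R) \\
\frac{\partial p}{\partial t} &= -\frac{1}{q}\,\nabla \cdot J_p + (G - R) \\
E &= -\nabla \psi \\
J_n &= q\,\mu_n\,n\,E + q\,D_n\,\nabla n \\
J_p &= q\,\mu_p\,p\,E - q\,D_p\,\nabla p
\end{aligned}
```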

The  solution  of  the  above  system  of  equations  provides  the  internal  distributions  of  the 
electrostatic  potential  and  the  carrier  densities,  and  the  external  terminal  currents.  These  equations  cannot 
be  solved  analytically  and  numerical  methods  have  to  be  used.  The  equations  are  discretized  in  the  space 
over  a  simulation  domain  and  then  time  discretization  is  used  to  solve  the  transient  problem. 

Consider a 1D device for which the space discretization is performed in the x-dimension. A
one-dimensional grid for space discretization is shown in Fig. 22.

Figure 22. Schematic of the grid used for 1D finite-difference space discretization


The discretized equations for dc analysis are written in terms of the nodal quantities ($\psi_i$, $n_i$, $p_i$)
after appropriate scaling [49]. The current density terms in the discretized continuity equations are
expressed through the Bernoulli function,

$$B(x) = \frac{x}{e^{x} - 1}$$



In  the  discretized  equations  the  doping  levels  appear  through  the  dependence  of  the  mobilities  and 
the  net  generation/recombination  rates  on  the  doping. 
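
The Bernoulli function needs some care numerically because of its removable singularity at x = 0. The sketch below shows one way to evaluate it, together with a scaled Scharfetter-Gummel electron current between two adjacent grid nodes. The function names, the unit thermal voltage, and the sign convention are assumptions made for illustration; they are not taken from the CODECS source.

```python
import math

def bernoulli(x: float) -> float:
    """B(x) = x / (exp(x) - 1), with the removable singularity at x = 0 handled."""
    if abs(x) < 1e-8:
        return 1.0 - 0.5 * x          # first-order Taylor expansion near zero
    if x > 50.0:
        return x * math.exp(-x)       # avoids overflow of exp(x) for large x
    return x / math.expm1(x)

def sg_electron_current(n_left, n_right, psi_left, psi_right, h, mu=1.0):
    """Scaled Scharfetter-Gummel electron current at the midpoint of an interval.

    Potentials are assumed to be normalized to the thermal voltage, so that the
    diffusivity equals the mobility (hypothetical scaling for this sketch).
    """
    d = psi_right - psi_left
    return (mu / h) * (n_right * bernoulli(d) - n_left * bernoulli(-d))
```

With zero potential difference the expression reduces to pure diffusion, and for carriers in equilibrium with the potential (n proportional to exp(psi)) the current vanishes, which are two quick sanity checks on the sign convention.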

After space discretization, a system of nonlinear differential-algebraic equations is obtained. The
complete system of equations can be represented in a general form as:

$$F(w, \dot{w}, V(t)) = 0 \qquad (12)$$

where $w$ is the vector of electrostatic potential, $\psi$, electron concentration, $n$, and hole
concentration, $p$, at each grid node, i.e., ($\psi_i$, $n_i$, $p_i$) for $i = 1, \ldots, N$, $\dot{w} = dw/dt$, and $V(t)$ is a
time-dependent voltage applied to the device.

Next assume an analytical doping profile, with $D$ being the doping-profile parameter; $D$ is the peak
concentration or characteristic length for a Gaussian profile, or the concentration for a uniform doping.
The implicit dependence of $w$ on $D$ can be incorporated in Eq. (12), whereby

$$F(w(D), \dot{w}(D), D, V(t)) = 0 \qquad (13)$$


To compute the differential sensitivities, Eq. (13) is differentiated with respect to the
doping-profile parameter $D$ to obtain

$$\frac{\partial F}{\partial w}\frac{dw}{dD} + \frac{\partial F}{\partial \dot{w}}\frac{d\dot{w}}{dD} + \frac{\partial F}{\partial D} = 0 \qquad (14)$$

Eq. (14) can be rearranged as

$$\frac{\partial F}{\partial \dot{w}}\frac{d}{dt}\left(\frac{dw}{dD}\right) + \frac{\partial F}{\partial w}\frac{dw}{dD} = -\frac{\partial F}{\partial D} \qquad (15)$$

From Eq. (15) the transient sensitivities $dw/dD = u$ are calculated as a time-dependent waveform
from the solution of a linear differential equation

$$\frac{\partial F}{\partial \dot{w}}\,\dot{u} + \frac{\partial F}{\partial w}\,u = -\frac{\partial F}{\partial D} \qquad (16)$$

$\partial F/\partial w$ and $\partial F/\partial \dot{w}$ are available from the time-domain solution of the device-level equations and
only the solution of a linear differential equation is required for calculating the sensitivities. Therefore,
this is a computationally efficient calculation [52]. The initial condition for calculating the transient
sensitivities is obtained from a dc sensitivity calculation.
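
The transient sensitivity calculation can be illustrated on a scalar stand-in for the device equations. The sketch below assumes a hypothetical model F = dw/dt + d*w**3 - V(t) = 0 (not a CODECS device model), integrates it with backward Euler, and advances the sensitivity u = dw/dd with one linear solve per time step using the same Jacobian. A perturbation estimate is included for comparison.

```python
def drive(t):
    """Applied voltage V(t): a ramp starting from 1.0 (illustrative values)."""
    return 1.0 + t

def simulate(d, steps=200, h=0.01):
    """Backward-Euler solution of F = dw/dt + d*w**3 - V(t) = 0 and of its
    sensitivity equation du/dt + 3*d*w**2*u = -w**3, where u = dw/dd."""
    # dc initial conditions: d*w0**3 = V(0) and u0 = -(dF/dw)^-1 * dF/dd
    w = (drive(0.0) / d) ** (1.0 / 3.0)
    u = -w / (3.0 * d)
    for k in range(1, steps + 1):
        t = k * h
        w_new = w
        for _ in range(50):  # Newton iteration for the nonlinear state equation
            resid = (w_new - w) / h + d * w_new ** 3 - drive(t)
            jac = 1.0 / h + 3.0 * d * w_new ** 2
            step = resid / jac
            w_new -= step
            if abs(step) < 1e-14:
                break
        # The Jacobian at the converged point gives the sensitivity from a
        # single linear solve; no further nonlinear iteration is needed.
        jac = 1.0 / h + 3.0 * d * w_new ** 2
        u = (u / h - w_new ** 3) / jac
        w = w_new
    return w, u

def perturbation_sensitivity(d, rel_step=1e-6):
    """Finite-difference estimate obtained from two complete simulations."""
    w_hi, _ = simulate(d * (1.0 + rel_step))
    w_lo, _ = simulate(d)
    return (w_hi - w_lo) / (d * rel_step)
```

The direct method costs one extra linear solve per time step, whereas the perturbation estimate requires a second full nonlinear simulation for every parameter of interest.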

From (13), one obtains the dc sensitivities by elimination of the time dependent terms.
Then

$$F(w(D), D) = 0 \qquad (17)$$

Eq. (17) is differentiated with respect to the doping-profile parameter $D$ to obtain

$$\frac{\partial F}{\partial w}\frac{dw}{dD} + \frac{\partial F}{\partial D} = 0 \qquad (18)$$

From Eq. (18) the dc sensitivities $dw/dD$ are calculated as

$$\frac{dw}{dD} = -\left(\frac{\partial F}{\partial w}\right)^{-1}\frac{\partial F}{\partial D} \qquad (19)$$

$\partial F/\partial w$ is available in LU-factors from the solution of the device-level equations under dc
conditions and only forward and back substitutions are required to calculate the dc differential
sensitivities [52].
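
The reuse of LU factors for the dc sensitivities can be sketched on a small two-unknown system. The residual below (a series resistor and an exponential element with unknowns v and i) is a hypothetical example, and the LU routines are a plain Doolittle factorization without pivoting, written out only to make the forward and back substitution explicit.

```python
import math

def lu_factor(a):
    """Doolittle LU factorization without pivoting; L and U share one matrix."""
    n = len(a)
    lu = [row[:] for row in a]
    for k in range(n):
        for i in range(k + 1, n):
            lu[i][k] /= lu[k][k]
            for j in range(k + 1, n):
                lu[i][j] -= lu[i][k] * lu[k][j]
    return lu

def lu_solve(lu, b):
    """Forward substitution with unit-diagonal L, then back substitution with U."""
    n = len(b)
    y = b[:]
    for i in range(n):
        for j in range(i):
            y[i] -= lu[i][j] * y[j]
    x = y[:]
    for i in reversed(range(n)):
        for j in range(i + 1, n):
            x[i] -= lu[i][j] * x[j]
        x[i] /= lu[i][i]
    return x

def residual(w, d):
    v, i = w
    return [v + i - 1.0, i - d * math.expm1(v)]

def jacobian(w, d):
    v, _ = w
    return [[1.0, 1.0], [-d * math.exp(v), 1.0]]

def dc_solve(d, tol=1e-13):
    """Newton iteration; returns the solution and the LU factors at that point."""
    w = [0.0, 0.0]
    for _ in range(100):
        lu = lu_factor(jacobian(w, d))
        step = lu_solve(lu, [-r for r in residual(w, d)])
        w = [wi + si for wi, si in zip(w, step)]
        if max(abs(s) for s in step) < tol:
            break
    return w, lu_factor(jacobian(w, d))  # refactored here so the factors are exact

def dc_sensitivity(d):
    """dw/dd = -(dF/dw)^-1 dF/dd using only forward and back substitution."""
    w, lu = dc_solve(d)
    df_dd = [0.0, -math.expm1(w[0])]
    return w, lu_solve(lu, [-x for x in df_dd])
```

In a production simulator the factors from the last Newton iteration would typically be reused directly; the extra factorization at the converged point is done here only to keep the sketch exact.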

Device level sensitivity calculations based on Eqs. (16) and (19) have been implemented in
CODECS for 1D diodes and bipolar junction transistors. Using this result the sensitivity of the terminal
currents to the doping levels can also be computed. This information is used by the circuit level
simulator to compute the sensitivity of circuit node voltages to doping profiles.

At the circuit level, the equations that are solved at each node are the KCL equations. At a
particular node to which a numerical device is connected, the KCL can be expressed as:

$$\ldots + I_d(D, V(t)) = 0 \qquad (20)$$

where $I_d(D, V(t))$ is the current flowing into the node from the numerical device and the rest of
the current contributions are represented by the ellipsis.

The sensitivity of the circuit level to doping profiles is obtained by differentiating (20) with
respect to the doping parameter $D$. This results in Eq. (21).

$$\frac{d(\ldots)}{dD} + \frac{\partial I_d}{\partial D} + \frac{\partial I_d}{\partial V}\frac{dV}{dD} = 0 \qquad (21)$$

where $\partial I_d/\partial D$ is the sensitivity of the device current to the doping profile and is known from the
device level sensitivity analysis. $dV/dD$ is the sensitivity of the node voltage to the parameter $D$ and can be
calculated from (21). In this manner the sensitivity of the node voltages to the doping profile can be
obtained. Eq. (21) forms the basis for the coupling of the device and circuit level sensitivity analyses.
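
As a minimal sketch of this coupling, consider a single node fed from a voltage source through a resistor and loaded by a diode-like device. The device function below is a hypothetical analytic stand-in for the numerical device: it returns the current together with the two partial derivatives that the device level analysis would supply. The component values and the sign convention (currents leaving the node sum to zero) are assumptions.

```python
import math

VT = 0.02585   # thermal voltage in volts (room temperature)
I_S0 = 1e-14   # hypothetical saturation current at d = 1, in amperes

def device(d, v):
    """Stand-in for the numerical device: I_d and its partial derivatives.

    Assumes I_d = (I_S0 / d) * (exp(v / VT) - 1), with d a normalized doping.
    """
    e = math.exp(v / VT)
    i_d = (I_S0 / d) * (e - 1.0)
    di_dd = -i_d / d                 # device level sensitivity to doping
    di_dv = (I_S0 / d) * e / VT      # small-signal conductance
    return i_d, di_dd, di_dv

def solve_node(d, v_src=1.0, r=1e3):
    """Newton solution of the KCL equation (v - v_src)/r + I_d(d, v) = 0."""
    v = 0.8
    for _ in range(200):
        i_d, _, di_dv = device(d, v)
        step = ((v - v_src) / r + i_d) / (1.0 / r + di_dv)
        v -= step
        if abs(step) < 1e-15:
            break
    return v

def node_sensitivity(d, v_src=1.0, r=1e3):
    """dV/dd from differentiating the KCL equation with respect to d."""
    v = solve_node(d, v_src, r)
    _, di_dd, di_dv = device(d, v)
    return v, -di_dd / (1.0 / r + di_dv)
```

The circuit level step needs only the device current sensitivity and the Jacobian entries already formed for the Newton iteration, so no additional device simulation is required.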

The extension of the 1D analyses to two dimensions is possible by considering a discretization of
the device level partial differential equations in two space dimensions as shown in Fig. 23.

Figure 23. Schematic of the grid used for 2D finite-difference space discretization

Poisson's equation and the current-continuity equations are discretized on this grid. In the
discretized equations the values of $E_{x,y}$, $J_{nx,y}$ and $J_{px,y}$ are required at the midpoints of each
interval $[x_i, x_{i+1}]$ or $[y_j, y_{j+1}]$. These values are approximated from the nodal values of the
electrostatic potential and the carrier concentrations. The rest of the sensitivity calculations proceed
as described by Eqs. (12) through (21).

From the sensitivity calculation several results can be demonstrated. First consider a 1D
pn-junction and the sensitivity of the dc forward current to the doping level in the n-type region. The
sensitivity can be computed based on the differential sensitivities computed from Eq. (19) or using a
perturbation approach. In the latter, the current is computed for two different values of doping and the
sensitivity is computed numerically from the difference in currents as shown in Eq. (28).

For small values of $\Delta N$ the results computed from Eq. (28) should be in agreement with those
determined from (16) or (19). This is shown to be the case in Fig. 24, where the dc current sensitivity to
doping is shown as a function of the forward voltage. For small values of $\Delta N$ the two calculations are in
excellent agreement.
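
Why the perturbation result drifts away from the differential sensitivity as $\Delta N$ grows can be seen on an idealized diode. The sketch below assumes a saturation current inversely proportional to the doping and a forward-difference form of the perturbation estimate; both are illustrative assumptions, not the numerical device model or the exact form of Eq. (28).

```python
import math

VT = 0.02585  # thermal voltage in volts

def diode_current(n_d, v, k=1.0):
    """Assumed ideal-diode model: saturation current proportional to 1 / n_d."""
    return (k / n_d) * math.expm1(v / VT)

def differential_sensitivity(n_d, v):
    """Exact dI/dN for the assumed model."""
    return -diode_current(n_d, v) / n_d

def perturbation_sensitivity(n_d, v, rel_dn):
    """Forward difference using two currents computed at n_d and n_d + dn."""
    dn = rel_dn * n_d
    return (diode_current(n_d + dn, v) - diode_current(n_d, v)) / dn

def ratios(n_d=1e16, v=0.6):
    """Perturbation estimate divided by the differential sensitivity."""
    exact = differential_sensitivity(n_d, v)
    return {r: perturbation_sensitivity(n_d, v, r) / exact
            for r in (0.01, 0.10, 0.50)}
```

For this model the ratio works out to 1 / (1 + $\Delta N$/N), that is, roughly 0.99, 0.91, and 0.67 for perturbations of 1%, 10%, and 50%. This is a property of the assumed model and not a reproduction of the data in Fig. 24.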


Figure 24. Sensitivity of the dc forward current of a pn-junction diode to the doping level in the
n-type (lower doping) region. Also shown are the sensitivities computed from a numerical perturbation
approach with $\Delta N$ variations of 1%, 10%, and 50%

Next consider this pn-junction diode and the calculation of the sensitivity of the transient current
to the doping level in the n-type region. The sensitivity can be computed based on the differential
sensitivities computed from Eq. (16) or using a perturbation approach. In the latter, the current is
computed for two different values of doping and the sensitivity is computed numerically from
the difference in currents as shown in Eq. (28).

A voltage ramp is applied to the diode, and Fig. 25 shows the transient current sensitivity
to doping as a function of time. For a small value of $\Delta N$ the two calculations (Eqs. (16) and
(28)) are in excellent agreement.

The final example is that of an npn BJT. The sensitivity of the transient collector and emitter
currents is computed with respect to the doping level in the collector region. In Fig. 26 these
sensitivities are shown along with the values computed using Eq. (28) with a $\Delta N$ of 10%. The two
calculations are in very good agreement.

These 1D examples demonstrate that the underlying sensitivity calculations are correct and one
can easily predict the sensitivities of device currents to doping profile variations. Based on this
foundation we have completed the formulations for the 2D sensitivity calculations and their
implementation. The implementation was being debugged and tested when we found a simulator named
RandFlux [50] that is being developed for release by Florida State University. This simulator has a very
comprehensive device/circuit simulation capability and also includes random fluctuation calculations at
the circuit/device level for dc and ac analysis. We are currently looking into more details of this
simulator and, since RandFlux will supersede our work, we have stopped further debugging and testing
of CODECS enhanced for sensitivity analysis to doping profiles.

This  work  is  still  unique  in  that  it  is  the  only  one  that  has  addressed  the  calculation  of  transient 
sensitivities  to  doping  profiles. 


Figure 25. Sensitivity of the transient forward current of a pn-junction diode to the doping level
in the n-type (lower doping) region. Also shown are the sensitivities computed from a numerical
perturbation approach with a $\Delta N$ variation of 10%

Figure 26. Sensitivity of the transient collector and emitter currents of an npn bipolar transistor to
the doping level in the n-type collector region. Also shown are the sensitivities computed from a numerical
perturbation approach with a $\Delta N$ variation of 10%. A good agreement is seen between the sensitivities
calculated by the direct and perturbation approaches


Test  Chips 

None.

1.5.4 Other Results

1.5.4.1  Technology  Transfer/Intellectual  Property 

None  yet. 

1.5.4.2 Resulting Publications and Presentations

None  yet. 

1.5.4.3 Benefits to Commercial Sector

Parameter variability is an important concern for nanoscale devices. AFRL and the commercial sector
will benefit by having a simulator that will accurately predict the effect of these variations on the
performance of analog/RF circuits in nanoscale technologies. This will enable robust design in the
presence of variability.

1.5.5  Conclusions 

A  coupled  device/circuit  simulator  has  been  enhanced  to  include  coupled  device  and  circuit  level 
sensitivity  analysis  for  determining  the  impact  of  random  fluctuations  on  the  performance  of  analog/RF 
circuits.  The  goal  was  to  develop  a  simulator  that  is  capable  of  predicting  the  effects  of  process 
fluctuations  on  the  performance  of  critical  analog/RF  circuitry.  Since  the  device  simulator  models 
(numerical  models)  are  predictive,  they  can  be  used  to  evaluate  the  impact  of  technology  on  circuit 
performance.  Coupled  device/circuit  simulation  will  be  essential  for  the  analysis  of  random  fluctuations  in 
analog/RF  ICs  fabricated  in  nanoscale  processes.  Parameter  variability  is  an  important  concern  for 
nanoscale  devices.  AFRL  and  the  commercial  sector  will  benefit  by  having  a  simulator  that  will 
accurately  predict  the  effect  of  these  variations  on  the  performance  of  analog/RF  circuits  in  nanoscale 
technologies.  This  will  enable  robust  design  in  the  presence  of  variability. 

1.5.6  Students  Receiving  AFRL  Fellowship 

Adam Heiber, AFRL graduate fellowship.

1.6 PROJECT 6: STOCHASTIC AND PASSIVE A/D TECHNIQUES OF
SUBMICRON CMOS

1.6.1  Abstract 

Stochastic  and  passive  conversion  techniques  are  proposed  as  high  performance  and  low  power 
analog  to  digital  data  converter  architectures.  Using  knowledge  of  the  random  nature  of  device  mismatch 
it  is  possible  to  employ  stochastic  techniques  allowing  the  use  of  many  smaller  and  less  accurate 
components  to  save  power  and  area  while  maintaining  accuracy.  Additional  benefits  include  high 
scalability  and  high  yield/robustness  with  process  variation,  voltage,  temperature,  and  radiation.  A  test 
chip  was  implemented  with  digital  cell  comparators  to  observe  their  performance  and  to  lead  to  a  design 
that  is  more  synthesizable. 

1.6.2  Project  Description 

This  proposed  research  is  to  explore  stochastic  and  passive  conversion  techniques  to  achieve  high 
performance  and  low  power  analog  to  digital  data  converters.  We  believe  such  techniques  will  lead  to 
high  tolerance  to  submicron  process  variation/mismatch,  and  optimized/reduced  power  consumption.  The 
continually  decreasing  feature  size  and  supply  voltages  of  ICs,  coupled  with  the  increasing  ubiquity  of 
low-power  mobile  applications  makes  research  into  such  low  voltage  submicron  analog  circuit  techniques 
absolutely  crucial  for  successful  integration  of  analog  blocks  into  state-of-the-art  IC  systems.  Our 
research  aims  to  not  only  achieve  these  goals  of  low  power  and  low  voltage,  but  also  to  relax  the 
increasingly  stringent  challenges  of  technology  scaling  from  one  process  to  another,  yielding  devices 
robust  to  process,  voltage  and  temperature  variations. 

We  seek  to  address  two  very  important  problems  (among  others)  in  analog  design — 
device/element  mismatch  and  noise  (small  signal  swing),  both  of  which  present  considerable  challenges 
in  the  scaling  of  the  analog  in  mixed  analog/digital  systems.  We  believe  that  one  promising  solution  to 
this  limitation  is  to  use  the  stochastic  nature  of  device  mismatch  to  our  advantage.  One  example  of  this 
application  would  be  to  consider  a  moderately  accurate  (8-10  bit)  subranging  flash  ADC.  Depending  on 
the  speed,  noise,  and  other  factors,  the  analog  comparators  in  the  flash  ADC  will  generally  have  high 
power  requirements  and  occupy  a  large  area  footprint  in  order  to  achieve  this  level  of  accuracy.  This  is 
especially  critical  if  the  input  voltage  signal  swing  is  reduced,  because  this  directly  translates  to  an  added 
burden on the accuracy of the comparators. Moreover, the comparators in the subrange require
pre-amplifiers for output-offset-storage as used extensively in [53]. The low intrinsic device gain brought
about by device scaling requires either that many amplifier stages be cascaded (which leads to stability
problems) or that stages be cascoded with gain boosting (which leads to small output swing). It may be possible to
replace  these  large  comparators  with  many/redundant,  small,  less  accurate  comparators.  Instead  of 
suppressing  comparator  offset,  it  may  be  possible  to  use  the  random  nature  of  the  offset  as  part  of  a 
stochastic  ADC. 

Flash  ADCs  typically  use  some  sort  of  reference  ladder  to  generate  the  comparator  trip  points  that 
correspond  to  each  digital  code.  First  proposed  in  [54],  a  stochastic  ADC  uses  device  mismatch  to 
generate  these  trip-points.  Consider  a  large  array  of  identically  drawn  comparators,  each  with  a  random 
input-referred  offset.  Individual  offsets  are  unknown,  as  they  are  random,  but  the  overall  offset 
distribution  can  be  defined  by  its  probability  density  function  (PDF).  If  all  of  these  comparators  are 
connected in parallel, i.e., their inputs are all connected as in Fig. 27(b), and a linear ramp is applied at the
input, a plot of the number of comparators that evaluate high against the input will follow the cumulative
distribution function (CDF), which is merely the integral of the PDF, as depicted in Fig. 27(c). If comparator
offset follows a Gaussian distribution or other distribution with a near linear CDF, then the CDF can be
used as the transfer function without calibration.
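
The behaviour described above can be sketched with a short Monte Carlo model. The code below draws Gaussian offsets for an array of identical comparators and counts how many evaluate high for each input on a ramp. The array size of 1024 follows Fig. 27; the offset standard deviation, the random seed, and the function names are assumptions made for illustration.

```python
import math
import random

def gaussian_cdf(z):
    """Standard normal cumulative distribution function."""
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))

def make_offsets(count, sigma, seed=1):
    """Random input-referred offsets for identically drawn comparators."""
    rng = random.Random(seed)
    return [rng.gauss(0.0, sigma) for _ in range(count)]

def comparators_high(v_in, offsets, v_ref=0.0):
    """Number of comparators whose trip point v_ref + offset lies below v_in."""
    return sum(1 for off in offsets if v_in > v_ref + off)

def ramp_response(offsets, sigma, points=41, span=3.0):
    """Count of high comparators for inputs swept from -span to +span sigma."""
    xs = [(-span + 2.0 * span * k / (points - 1)) * sigma for k in range(points)]
    return [(x, comparators_high(x, offsets)) for x in xs]
```

The count is monotonic in the input by construction, and its normalized value tracks the Gaussian CDF to within the sampling noise expected for the chosen number of comparators.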

Pipelined  A/D  converters  are  an  attractive  option  for  medium  speed  (20-200MHz)  and  medium- 
to-high  resolution  (8-14  bits)  conversion.  Satisfying  the  opamp  gain,  speed,  and  slew  requirements  in 
conventional pipelined architectures is often a primary design challenge. We seek to explore alternative
techniques for pipelined A/D conversion that can alleviate these design constraints. Our current research
is a hybrid structure that utilizes the techniques of comparator based switched capacitor and correlated
level shifting in a new way, and which employs a new type of continuous-time comparator, the inverse
class AB (iCAB).


Figure 27. a) Probability density function of comparator offset in terms of standard deviation, $\sigma$, assuming
Gaussian distribution, b) 1024 comparators connected in parallel with a single, fixed reference and a ramp
input, c) Output of 1024 comparators with ramp input in terms of $\sigma$


1.6.3  Research  Results  and  Discussion 

Test Chips

A test chip was fabricated in Jazz 0.18 µm BiCMOS (Fig. 28) with a total area of 5.76 mm². It can be
seen in Fig. 29(a) that increasing the number of active comparators yields a measured increase in ENOB
calculated from SNDR. This indicates that linearity continues to increase as a function of the number of
comparators; however, note that enabling more than 1152 comparators for Gaussian nonlinearity
reduction (Fig. 30) does not yield any additional observed improvement.

Since area and power scale linearly with the number of comparators (Fig. 29(b)), we chose to
enable only 1152 comparators to demonstrate the concept and obtain additional measurement results,
thereby reducing the effective active area to 0.43 mm².

Figure 28. Die photo. Die dimensions are 2.4 mm by 2.4 mm. Also, layout screen capture showing detail of
functional blocks. Note comparator size in relationship to full adders


Figure 29. a) Measured ENOB plotted against number of comparators activated. The dashed line uses the
Gaussian nonlinearity reduction technique described in this report. For comparison, the solid line is
measured ENOB from a single group of comparators using a generated lookup table, b) Area and power scale
linearly with the number of active comparators


Figure 30. Transfer functions for two groups of parallel comparators with fixed references of $-\sigma$ and $+\sigma$ for
groups A and B, respectively. The sum of these groups has higher linearity over the range $-\sigma$ to $+\sigma$

Since these digital cell comparators (Fig. 31) are made up of minimum sized transistors, the standard
deviation ($\sigma$) of comparator offset is expected to be quite large. In fact, measurement shows that for our
test setup with supply voltage of 900 mV, $\sigma \approx 140$ mV. Because the signal range is $-\sigma$ to $+\sigma$, the resulting
signal range is 280 mV. With comparator offsets of this magnitude, it would be difficult to obtain any
resolution with conventional circuit techniques. The active comparators are divided into two groups of
576 comparators each and given fixed differential references of $-\sigma$ and $+\sigma$. A 1 MHz sine input is applied
and ENOB calculated from SNDR is above 4.9 bits up to 18 MS/s (Fig. 32). The abrupt drop in ENOB
observed beyond 18 MS/s is due to ripple-carry adders not having enough time to resolve, thus causing
gross digital errors. By designing a faster adder tree it should be possible to achieve higher sampling
rates.
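
The benefit of splitting the comparators into two groups referenced at $-\sigma$ and $+\sigma$ can be checked on ideal Gaussian CDFs, without any random sampling. The sketch below compares the worst-case deviation from a straight line over the $-\sigma$ to $+\sigma$ input range for a single group and for the average of two offset groups. The end-point line fit and the function names are assumptions chosen for simplicity; they are not the linearity metric used for the measured results.

```python
import math

def gaussian_cdf(z):
    """Standard normal cumulative distribution function."""
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))

def single_group(x):
    """Normalized output of one group referenced at 0; x is in units of sigma."""
    return gaussian_cdf(x)

def two_groups(x):
    """Average of two equal groups referenced at -1 sigma and +1 sigma."""
    return 0.5 * (gaussian_cdf(x + 1.0) + gaussian_cdf(x - 1.0))

def worst_nonlinearity(transfer, lo=-1.0, hi=1.0, points=2001):
    """Largest deviation from the end-point line, as a fraction of the output span."""
    y_lo, y_hi = transfer(lo), transfer(hi)
    worst = 0.0
    for k in range(points):
        x = lo + (hi - lo) * k / (points - 1)
        line = y_lo + (y_hi - y_lo) * (x - lo) / (hi - lo)
        worst = max(worst, abs(transfer(x) - line))
    return worst / (y_hi - y_lo)
```

Under these idealized assumptions the single group deviates from the line by roughly 3% of the output span, while the two-group sum stays well under 1%, which is consistent in direction with the improvement reported for the test chip.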


Figure 31. a) Schematic of comparator with a secondary latch to maintain digital output when comparator is
reset. All transistor sizes are W/L = 0.22 µm/0.18 µm (the minimum allowed in this 0.18 µm process), b) Layout
of comparator and secondary latch. Minimum sized devices are used and supply rail pitch matches digital
library cells to allow for fully automated synthesis. Cell dimensions are 14.55 µm by 5.84 µm

Figure 32. ENOB plotted against sampling frequency (MHz) for 1152 comparators configured as described by
Fig. 30. $f_{in}$ = 1 MHz and VDD = 900 mV


The Gaussian nonlinearity reduction can be best seen in Fig. 33. With all 1152 comparators acting as
a single parallel group, i.e., their inputs are connected and references are connected, sweeping the input
with a linear ramp reveals a transfer function that indeed resembles a Gaussian CDF. SNDR of 25.1
dB is achieved with a 1 MHz input and sampling frequency of 8.192 MHz. Using the exact same
comparators under the same conditions, but merely dividing them into two groups with differing
references, an 8.5 dB improvement in SNDR can be seen. Plots of differential nonlinearity (DNL) and
integral nonlinearity (INL) for this test setup can be seen in Fig. 33(c).
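
The SNDR figures quoted above can be related to ENOB with the standard full-scale sine-wave relation ENOB = (SNDR - 1.76 dB) / 6.02 dB. The short sketch below applies it to the two reported operating points; the function and variable names are illustrative.

```python
def enob_from_sndr(sndr_db):
    """Effective number of bits for a full-scale sine input."""
    return (sndr_db - 1.76) / 6.02

# Single parallel group of 1152 comparators: SNDR of 25.1 dB
single_group_enob = enob_from_sndr(25.1)

# Same comparators split into two groups: 8.5 dB higher SNDR
two_group_enob = enob_from_sndr(25.1 + 8.5)
```

These evaluate to about 3.9 bits and 5.3 bits, respectively, so the 8.5 dB improvement corresponds to roughly 1.4 additional effective bits.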


Figure 33. a) Measured transfer function of a single group of 1152 parallel comparators ($\sigma \approx 140$ mV) and
FFT of 1 MHz sine input. $f_s$ = 8 MHz. b) Measured transfer function of same parallel comparators as two
groups of 576 with differing fixed references set to $-\sigma$ and $+\sigma$ for groups A and B, respectively. Also, FFT of
output from the sum of groups A and B of 1 MHz sine input. $f_s$ = 8 MHz. c) DNL and INL of summed output
from groups A and B, $f_s$ = 8 MHz


Power consumption for the analog portion is 182 µW. Digital power is scaled to reflect the amount
that is related to the number of active comparators. Digital power consumed by disabled portions of the
chip is not included. Digital power is then found to be 449 µW with 188 µW consumed by clock drivers,
leaving 261 µW consumed by the pipelined ripple-carry adder tree.

Since the largest source of power consumption is the digital adder tree, a lower power solution must
be found. This led us to the concept of a Passive Linear COunter, or PLINCO. As seen in Fig. 34, a
PLINCO is merely a matrix of many multiplexers, which may be made of MOSFETs as passive pass
transistors (Fig. 35). The PLINCO essentially converts any non-thermometer coded input into a
monotonic thermometer code at the output. Since a "one-hot" encoded output is actually more useful to
convert to binary, the PLINCO can be restructured (Fig. 36). It should be noted that if a PLINCO has a
large number of inputs, then the area may become too large, as it increases quadratically. By adding some
peripheral logic, the PLINCO can be folded to save area. Figure 37 shows a 4-input PLINCO cell that
then folds the output in half. By using folded PLINCO cells, the quadratic dependence on the number of
inputs is removed at the expense of logic gates.
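
The function of a PLINCO can be sketched with a behavioural model. The code below uses one row of 2:1 multiplexers per input bit, where each row either passes the previous row unchanged or passes it shifted by one position. This is one plausible arrangement that produces the stated input-output behaviour and the quadratic growth in multiplexer count; it is an assumption, not the topology drawn in Figs. 34 and 36.

```python
def plinco_thermometer(bits):
    """Convert an arbitrary bit pattern into a thermometer code.

    Returns the output code (position 0 first) and the number of 2:1
    multiplexers used by this model.
    """
    width = len(bits)
    row = [0] * width
    mux_count = 0
    for bit in bits:
        shifted = [1] + row[:-1]   # previous row moved up by one, 1 shifted in
        # Each output position is one multiplexer controlled by the input bit.
        row = [s if bit else r for r, s in zip(row, shifted)]
        mux_count += width
    return row, mux_count

def plinco_one_hot(bits):
    """Variant whose single hot output position equals the number of 1 inputs."""
    row = [1] + [0] * len(bits)
    for bit in bits:
        shifted = [0] + row[:-1]
        row = [s if bit else r for r, s in zip(row, shifted)]
    return row
```

For the 8-bit input 0, 1, 1, 0, 1, 0, 0, 1 the thermometer model returns four ones followed by four zeros and uses 64 multiplexers, which illustrates the quadratic dependence on the number of inputs mentioned above.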


Figure  34.  A  PLINCO  is  made  up  of  a  matrix  of  many  multiplexers,  to  perform  the  function  of  creating  a 
thermometer-coded  output  from  a  non-thermometer  coded  input 


Figure 35. A passive multiplexer made up of pass transistors

Figure 36. A varied version of the PLINCO structure to obtain a one-hot-encoded output instead of
a thermometer code


Figure 37. A folded PLINCO cell. The output of the PLINCO is 8-wide, but only the left half or right half is
passed on. A carry bit indicates that no information is lost

Hybrid Zero-Crossing Detector and Opamp Pipeline

This  pipelined  A/D  converter  introduces  two  main  ideas.  First  of  all,  the  design  proposes  a  new 
approach  to  the  technique  of  correlated  level  shifting  (CLS)  [56].  Using  CLS,  the  basic  MDAC  operation 
of  the  pipeline  is  achieved  by  coupling  a  zero-crossing  detector  (ZCD)  [57-59]  based  approximation 
phase  with  an  opamp  fine  settling  phase  (Fig.  38).  The  ZCD  provides  a  coarse  approximation  of  the 
signal  with  higher  power  efficiency  than  can  be  achieved  with  an  opamp,  and  greatly  reduces  the  slew 
requirements  of  the  opamp  second  phase.  The  high  gain  opamp  overcomes  the  typical  drawbacks  of  the 
open-loop  ZCD  architecture  by  providing  high  accuracy  settling.  CLS  decouples  the  MDAC  output  node 
from  the  opamp  output  node,  thereby  minimizing  the  opamp  output  swing  requirements.  The  low  swing 
requirement  allows  the  desired  opamp  gain  to  be  met  by  adding  additional  cascode  transistors  rather  than 
the  traditional  method  of  gain  boosting,  which  saves  the  power  associated  with  the  additional  amplifiers 
typically  required. 

The second main idea of this design is to introduce a new zero-crossing detector architecture. In the
literature thus far [59], no fully differential zero-crossing detector designs have deviated from well known
threshold comparator architectures. A new type of zero-crossing detector (Fig. 39), dubbed the inverse
class-AB detector (iCAB), operates, as the name implies, in the opposite manner to a class-AB
opamp. Whereas a class-AB opamp uses more current when the inputs are far apart and less when they
are close together, the iCAB uses less current when the inputs are far apart and more current when the
inputs are close to the decision threshold. In the context of a threshold comparator this is useful because
the time delay of the comparator decision is heavily dependent on the gm of the ZCD during the narrow
time frame in which the outputs of the ZCD are transitioning. For comparator-based circuits, the smaller
the time delay, the better. The iCAB maximizes power efficiency for a given ZCD time delay by
concentrating the power consumption around the detection instant (when the inputs to the ZCD are equal).


Figure 38. Flip-around MDAC structure in amplify phase. During the sub-phase Φ1 the comparator and
current sources approximate the output voltage. The CLS capacitors de-correlate the MDAC output voltage
from the opamp output voltage in Φ2, allowing for wide-swing, high accuracy settling without the need for
gain boosting amplifiers


Figure 39. Inverse class-AB ZCD operation obtained by coupling tail current to output via capacitor. As Vo-
begins to rise, Itail increases; this leads to: low current before decision, high current during decision, and zero
current after decision


1.6.4  Other  Results 

1.6.4.1  Technology  Transfer/Intellectual  Property 

The ultimate end-goal of the research on stochastic analog-to-digital conversion is a state-of-the-art,
high speed, moderate accuracy (8-10 bits) ADC that is mostly (if not completely) synthesizable,
scalable, and robust against process, voltage, temperature, and radiation. This would have many benefits
in any low-voltage applications that require a high speed ADC.

1.6.4.2 Publications and Presentations

None yet.

1.6.4.3 Benefits to Commercial Sector

A  low  power,  high  speed,  high  yield  ADC  is  required  for  a  majority  of  commercial  electronic  circuits. 
This  research  offers  these  qualities  and  also  high  robustness  against  radiation  and  partial  failure  of  the 
comparator  array.  This  will  lead  to  longer  product  lifetimes. 

1.6.5  Conclusions 

In this project, a test chip was fabricated in Jazz 0.18 µm BiCMOS. The test chip achieves over
4.9 b ENOB up to 18 MS/s with a 900 mV supply. With a sampling frequency of 8.192 MHz and a 1 MHz
input, 33.6 dB SNDR is achieved while consuming 631 µW and occupying 0.43 mm². A technique that is
unique to a Gaussian stochastic converter is used to improve linearity by an additional 8.5 dB. The
ultimate end-goal of the research on stochastic analog-to-digital conversion is a state-of-the-art, high
speed, moderate accuracy (8-10 bits) ADC that is mostly (if not completely) synthesizable, scalable, and
robust against process, voltage, temperature, and radiation. This would have many benefits in any
low-voltage applications that require an ADC.

1.6.6  Student(s)  Receiving  AFRL  Fellowships 

Skyler Weaver, “Stochastic ADC techniques.”


1.7 PROJECT 7: RECONFIGURABLE MASTER/SLAVE LOCKED LOW NOISE AMPLIFIERS


1.7.1  Abstract 

The  next  generation  of  military  and  commercial  wireless  systems  must  accommodate  two 
increasingly  important  factors:  the  need  for  extreme  reconfigurability  and  the  requirement  of  low  power 
dissipation.  This  reconfigurability  is  driven  by  the  following:  the  ability  to  cope  with  large  process 
variations/design  uncertainty,  the  need  to  operate  across  multiple  wireless  standards,  and  operation  in 
various  SNR/interference  scenarios.  In  this  project,  we  will  develop  a  frequency  locked  LNA  that  can 
accommodate  process/design  uncertainty  as  well  as  allow  operation  over  a  wide  range  of  LNA  bias 
points.  This  architecture  will  allow  an  extremely  wide  front-end  tuning  range,  providing  the  flexibility  of 
a  wideband  front-end  with  the  power  and  linearity  benefits  of  a  narrowband  receiver.  While  retaining 
high  performance,  we  are  extending  the  linearity  of  the  LNA  in-band  to  better  resist  strong  interferers 
without  increasing  the  supply  rails. 


1.7.2  Project  Description 

The ongoing transition to nanoscale technology nodes will enable incredible levels of
functionality and system performance. However, these processes also introduce significant technical
challenges for future wireless system architectures.

The current generation of mobile telephones contains up to eight receivers, and future generations
promise to integrate even more functionality. Although active devices scale with increasing lithographic
resolution, the size of on-chip electromagnetic components is fundamentally and theoretically limited and
does not scale. Figure 40 shows one of our recent 2 GHz transceivers [60]. A significant percentage of the
(1 x 2) mm² CMOS die area is dedicated to the on-chip LNA inductors. The goal of this project is
to accurately cover the maximum number of frequency bands with a single LNA, minimizing the number
and area of on-chip electromagnetic tuning elements.

To  continue  adding  functionality  and  increasing  numbers  of  wireless  standards  accommodated  in 
a  single  chip,  the  number  and  size  of  on-chip  inductors  must  be  capped.  The  solution  we  propose  is  to 
develop  a  class  of  reconfigurable  multi-band  low-noise  amplifiers  (LNAs)  that  are  locked  to  a  frequency 
reference,  allowing  precise  tuning  to  multiple  frequency  bands.  The  LNA  will  be  optimized  to  tune  over  a 
wide  bandwidth  to  accommodate  multiple  wireless  standards.  To  counteract  process  variations,  a  novel 
phase  detection  scheme  will  be  used  to  accurately  detect  the  front  end  frequency  and  lock  it  to  the  local 
oscillator  (LO)  PLL.  Since  the  PLL  is  in  turn  locked  to  the  system  frequency  reference,  the  LNA  tuning 
will  be  accurate  over  process,  bias  points,  and  design  uncertainty. 
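
As a rough illustration of why a phase measurement can serve as the lock signal, the sketch below models the LNA tank as an ideal parallel RLC network. This model, the component values, and the function names are assumptions made for illustration only; they are not taken from the fabricated circuit. In this simplified model the impedance phase at the reference frequency is zero when the tank is centered on the reference and changes sign on either side, for both a low-Q and a high-Q tank.

```python
import cmath
import math


def tank_impedance(f_hz, r_ohm, l_h, c_f):
    """Impedance of an ideal parallel RLC tank (illustrative model only)."""
    w = 2 * math.pi * f_hz
    admittance = 1 / r_ohm + 1j * w * c_f + 1 / (1j * w * l_h)
    return 1 / admittance


def phase_error_deg(f_ref_hz, r_ohm, l_h, c_f):
    """Phase of the tank impedance at the reference frequency, in degrees.

    Positive when the tank resonates above f_ref, negative when it
    resonates below, and zero when the tank is centered on f_ref.
    """
    return math.degrees(cmath.phase(tank_impedance(f_ref_hz, r_ohm, l_h, c_f)))


# Hypothetical values: 2 GHz reference and a 2 nH tank inductor.
F_REF = 2e9
L_TANK = 2e-9
C_CENTERED = 1 / ((2 * math.pi * F_REF) ** 2 * L_TANK)

for r_ohm in (100.0, 1000.0):        # low-Q and high-Q tank
    for scale in (0.9, 1.0, 1.1):    # capacitance too small, centered, too large
        err = phase_error_deg(F_REF, r_ohm, L_TANK, scale * C_CENTERED)
        print(f"R={r_ohm:6.0f} ohm  C={scale:.1f}*C0  phase={err:+7.2f} deg")
```

The sign of the printed phase indicates the direction in which the tank capacitance would need to move, which is the property a feedback loop needs from an error signal.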

Our  proposed  system  will  allow  the  accurate  tuning  of  one  narrowband  LNA  over  a  wide  range  of 
wireless  standards,  ranging  from  approximately  800  MHz  to  2.4  GHz.  This  will  be  accomplished  through 
in-situ  calibration  of  the  LNA.  Since  the  chip  will  already  contain  a  band-switching  PLL,  this  block  will 
be  re-used  to  tune  the  LNA  center  frequency.  Integrated  LNAs  typically  require  two  separate  resonant 
tanks:  a  narrowband  input  match  and  narrowband  tuned  load.  In  [61],  Liscidini  showed  that  an  input 
match defined using positive feedback taken from the output tank will allow simultaneous tuning of the
input/output tanks. However, this implementation suffered from a relatively large discrepancy between
the designed and measured center frequencies due to imperfect modeling. By frequency locking the LNA
to the RF carrier frequency during a calibration phase, this uncertainty in tuning frequency will be
eliminated. Figure 41 below shows our high-level architecture for calibrating the LNA frequency.


Figure 40. 2 GHz CMOS Transceiver. A significant amount of the die area is sacrificed to the on-chip LNA
inductors [60]


Figure  41.  Architecture  of  Reconfigurable  LNA 


Tuning  low-frequency  baseband  filters  and  amplifiers  with  master-slave  topologies  has  been  extensively 
studied  in  the  literature  as  a  way  to  overcome  process,  supply,  and  temperature  (PVT)  variations  [62].  By 
frequency  locking  a  replica  filter  to  the  system  frequency  reference,  a  precise  amplifier/filter  center 
frequency/bandwidth  can  be  set.  To  our  knowledge,  such  an  architecture  has  not  been  demonstrated  for 
RF  front-ends.  In  addition  to  PVT  stability,  a  frequency  locked  narrowband  LNA  will  allow  a  wide 
variation  in  current  and  bias  point  settings.  This  opens  the  potential  for  dynamically  trading  off  power 
consumption  for  gain,  noise,  and  linearity.  Traditionally,  RF  front-end  circuitry  is  designed  for  worst-case 
communication environments. For example, stringent noise-figure specifications are determined by the
minimum-detectable  signal  levels  that  are  only  occasionally  seen  in  practice.  Linearity  requirements  are 
set  by  the  possibility  of  high  interferer  conditions. 

One of the key contributions of this work is our center frequency detection technique. Instead of
injecting a test signal and tuning the amplifier tank to achieve a maximum transfer function magnitude,
we detect the transfer function phase. This yields a control loop error signal that can be driven to
zero through a feedback loop, providing accurate frequency definition regardless of the tank quality
factor.


1.7.3 Research Results and Discussion

At  the  beginning  of  the  project,  we  developed  the  following  topology  for  an  auto-tuning  LNA  SOC. 



Figure  42.  Simplified  frequency-locked  LNA  architecture 

In Figure 42, G1 is a transconductance and G2 is a 50 ohm output buffer. Responding to requests and
suggestions from the CDADIC members, we reduced the die area by removing the output inductor. To
further improve the auto-tuning system, we moved the phase sampling point to before G2 and put a 90
degree phase shift between the frequency reference and the phase detector. This streamlines the
implementation, because the phase shift can come from a QVCO that would already be in the transceiver
design.



Figure  43.  Modified  Auto-Tuning  Amplifier  SOC 


In order to improve the noise figure of the LNA, we used a gm-boosting technique [63,64]. In contrast to
the circuits presented in the literature, we found that by choosing a differential structure with an
auto-transformer as the input to the amplifier, we could elegantly combine a lossless impedance
transformation to multiply the gain with the gm-boosting circuitry (Figure 43).

Rather  than  switching  capacitors  to  get  extra  tuning  range  and  crippling  the  LNA’s  performance, 
we  chose  to  switch  out  windings  of  the  source  and  load  inductors,  finding  a  compact  way  to  combine  this 
concept  with  the  gm-boosting  and  the  auto-transformer.  Unlike  other  examples  of  switched  winding 
circuits  in  the  literature,  we  realized  that  by  only  grounding  one  switch  at  a  time,  we  prevented  large 
currents  from  being  magnetically  induced  in  the  grounded  section  of  the  coil.  This  eliminated  a  major  loss 
mechanism  and  increased  tank  Q  (a  single  ended  LNA  with  some  similar  concepts  was  published  recently 
[65]). 

In addition, by stepping down the source of the common-gate stage to an even lower impedance
than the 50 ohm front end, we had lower voltage swings and therefore better linearity for a given input
power. In simulations, the IIP3 reached +8 dBm with a 1.2 V rail. Also, the capacitance of the front end
transistors was effectively stepped down by the turns ratio of the auto-transformer. This allowed large
devices to be used, so that a good front end impedance match could be achieved with a lower current.

Since the turns ratio is greatest in the highest frequency bands, unusually large (768 µm/minimum)
devices were usable at 2.6 GHz.


Test  Chips 

The final circuit was simulated with a 300% tuning range, operating in 4 switched bands with analog
(MOS varactor) tuning within each band. A simplified, 3-band version of the circuit with layout is
shown in Figure 44.


Figure  44.  Simplified  gm-boosted  front  end  with 
3  bands  of  inductor  winding  switching 


The chip was fabricated in a 0.13 µm CMOS process. However, as the switch quality improves with
shrinking process nodes and the parasitic capacitance decreases, the tuning range becomes wider, the
circuit Q increases, the gain increases, and the noise figure goes down. In addition, as the channel length
decreases, the power requirements are reduced and the tuning range is further augmented. Therefore, in
advanced CMOS processes we expect improved performance.

A die photo of the chip can be seen in Figure 45. The die area used is 300 µm x 1000 µm, not
including pads and coplanar waveguide input and output structures.


Figure  45.  Chip  Die  Photo 

Several things were ascertained through simulation about the auto-tuning SOC which would be of
use to anyone who wishes to create such a chip. First, in simulation, the tuning loop converged in
less than 100 ns. However, since accurate phase reconstruction of a wide swing, non-linear circuit
(Gilbert Cell) is required, finding the frequency-tuning offset from the reference frequency is an issue.

We found that it was useful to run the simulation with a series of convergence parameters, then
extrapolate to 0 error. This saved time over running a single simulation with a more stringent
convergence criterion and gave some idea about the errors involved.
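
The extrapolation step can be sketched as follows. This is a minimal illustration, assuming the simulated quantity varies approximately linearly with the convergence tolerance; the function name and the tolerance and offset values are made-up placeholders, not simulation data from this project.

```python
def extrapolate_to_zero(tolerances, results):
    """Least-squares straight-line fit of results versus tolerance.

    Returns (intercept, slope). The intercept estimates the result that
    would be obtained at zero tolerance; the slope indicates how strongly
    the result depends on the tolerance setting.
    """
    n = len(tolerances)
    mean_x = sum(tolerances) / n
    mean_y = sum(results) / n
    sxx = sum((x - mean_x) ** 2 for x in tolerances)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(tolerances, results))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return intercept, slope


# Hypothetical runs: frequency offset (MHz) at three loose tolerance settings.
tolerances = [1e-3, 5e-4, 2.5e-4]
offsets_mhz = [4.0, 3.0, 2.5]

estimate, sensitivity = extrapolate_to_zero(tolerances, offsets_mhz)
print(f"estimated offset at zero tolerance: {estimate:.3f} MHz")
print(f"sensitivity: {sensitivity:.1f} MHz per unit tolerance")
```

The spread between the loosest run and the extrapolated value gives a rough idea of the error attributable to the convergence setting.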

1.7.4  Other  Results 

1.7.4.1 Technology Transfer/Intellectual Property

Patent: “Auto-Tuning Amplifier”, U.S. Serial No. 12/020,805 (patent filed).

Our  phase-calibrated  tuning  method  could  be  of  immediate  interest  to  our  military  and  commercial 
members.  This  technology  can  be  directly  applied  to  a  host  of  RF  communications  systems  that  currently 
require  multiple  RF  front-ends  or  extremely  wide  tuning  ranges. 

1.7.4.2 Publications and Presentations

Contingent upon a successful outcome, our tentative publication plan is as follows:

A conference publication presenting circuit techniques and system performance
A journal publication detailing the analysis and limitations of our phase-calibrated tuning control loop

1.7.4.3 Benefits to Commercial Sector

Industry  is  currently  facing  a  crisis  with  the  area  consumed  by  on-chip  inductors.  This  problem  is 
exacerbated  by  the  following  trends:  the  push  towards  smaller  feature  sizes  and  the  need  for  multi-band 
single-chip  transceivers.  As  systems  are  migrated  to  subsequent  technology  nodes,  the  effective  cost  of 
on-chip inductors becomes extremely high. For example, in a 65nm process, a single 300 µm x 300 µm
inductor  consumes  the  silicon  area  of  approximately  1  million  logic  or  memory  transistors.  A  four-band 
receiver  designed  with  parallel  front-ends  may  require  8  or  more  inductors,  consuming  a  prohibitively 
large  amount  of  silicon  area.  These  same  problems  are  faced  in  the  military  sector,  especially  with  the 
need  to  have  ubiquitous  communication  devices  that  span  a  broad  range  of  military/civilian  wireless 
standards. 
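
The area comparison above can be checked with simple arithmetic. The sketch below uses only the figures quoted in the text; the per-transistor area it prints is merely what those figures imply, not an independent process datum, and the variable names are illustrative.

```python
# Figures quoted in the text
inductor_side_um = 300.0             # one side of a square on-chip inductor
transistors_displaced = 1_000_000    # transistors said to fit in the same area
inductors_in_receiver = 8            # four-band receiver with parallel front-ends

inductor_area_um2 = inductor_side_um ** 2
implied_area_per_transistor_um2 = inductor_area_um2 / transistors_displaced
total_inductor_area_mm2 = inductors_in_receiver * inductor_area_um2 / 1e6

print(f"area of one inductor: {inductor_area_um2:.0f} um^2")
print(f"implied area per transistor: {implied_area_per_transistor_um2:.2f} um^2")
print(f"area of {inductors_in_receiver} inductors: {total_inductor_area_mm2:.2f} mm^2")
```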

In  this  project,  we  have  introduced  new  techniques  to  allow  the  reuse  of  front-end  blocks  across 
multiple  bands  while  maintaining  frequency  and  impedance  accuracies.  This  will  help  offset  the  effects  of 
process  variation  and  design  uncertainty,  as  well  as  significantly  reduce  the  amount  of  on-chip  area 
consumed  by  inductors.  This  technique  is  general  enough  to  be  used  by  many  types  of  transceiver 
architectures,  allowing  universal  application  in  many  commercial  and  DoD  sectors. 

1.7.5  Conclusions 

The  next  generation  of  wireless  systems  must  accommodate  two  increasingly  important  factors: 
the  need  for  extreme  reconfigurability  and  the  requirement  of  low  power  dissipation.  This 
reconfigurability  is  driven  by  the  following:  the  ability  to  cope  with  large  process  variations/design 
uncertainty,  the  need  to  operate  across  multiple  wireless  standards,  and  operation  in  various 
SNR/interference  scenarios.  This  project  developed  a  frequency  locked  LNA  that  can  accommodate 
process/design  uncertainty  as  well  as  allow  operation  over  a  wide  range  of  LNA  bias  points.  Industry  is 
currently  facing  a  crisis  with  the  area  consumed  by  on-chip  inductors.  This  problem  is  exacerbated  by  the 
following trends: the push towards smaller feature sizes and the need for multi-band single-chip
transceivers. These same problems are faced in the military sector, especially with the need to have
ubiquitous communication devices that span a broad range of military/civilian wireless standards along
with fast calibration over a wide range of conditions. The technique developed here is general enough to
be used by many types of transceiver architectures, so it should find wide application in many
commercial and military sectors.

1.7.6  Students  Receiving  AFRL  Fellowships 

Fedja  Karalic  (AFRL  Undergraduate  Fellowship)  -  Fedja  is  currently  performing  multiple  inductor 
layouts  and  running  EM  simulations.  He  will  be  helping  with  this  project  until  June  2008. 

Will Biederman (AFRL Undergraduate Fellowship) - During the summer of 2008, Will Biederman
assisted with testboard layout and fabrication.

Julie Hu (AFRL Summer Fellowship) - In addition to continuing work on her frequency synthesizer, Julie
assisted in the testing and characterization of the frequency locked LNA.


1.8 PROJECT 8: HIGHLY CONFIGURABLE AND ROBUST DATA CONVERTERS


1.8.1  Abstract 

A  multi-cell  delta-sigma  analog-digital  converter  was  developed  in  this  project,  with  a  digital 
calibration  and  digital  control  of  the  overall  resolution  and  power  requirement.  The  resulting  structure  can 
accommodate  large  process  variations,  as  called  for  in  Research  Task  Area  1,  and  also  allows  a 
reconfiguration  of  the  performance  using  a  digital  control  input  signal,  as  specified  by  Research  Task 
Area  3. 

1.8.2  Project  Description 

As a preliminary to the main multi-cell design project, two noise-coupled (ring-coupled and
self-coupled) four-cell ADCs have been designed, fabricated and tested. They performed according to
specifications, verifying the underlying principles of noise-coupled multi-cell delta-sigma converters. A
more powerful double-sampling multi-cell converter was then designed, with optimum layout. A novel
double-sampling circuit was found and simulated, indicating improved performance. The details of these
projects are described below.

A delta-sigma modulator with low distortion architecture significantly reduces the signal swing
and nonlinear signal distortion. However, the linearity will be limited by harmonic spurs and idle tones
generated in the loop. Usually, an external dither is injected into the loop to prevent periodic tones. This,
however, requires additional hardware for dither generation, and it also reduces the dynamic range (DR)
of the modulator. Using noise coupling [66-70], the quantization noise itself is used as dither. Two
versions, the ring-coupled and self-coupled techniques, were introduced. For the ring-coupled technique,
the injected noise comes from another modulator, while for the self-coupled technique, the injected noise
is generated by the modulator itself. The injected noise also enhances the noise-shaping performance of
the overall converter. The proposed four-cell ring-coupled modulator is shown in Fig. 46. A prototype chip
was fabricated in a 0.18 µm 2P4M CMOS process. With a single-tone input swept from 10 kHz to 1 MHz,
and sampled at a 60 MHz rate, it provides 86 dB peak SNR and SNDR, 88 dB DR, and -102.4 dB THD.
The excellent linearity is verified in Fig. 47a for a -0.92 dBFS, 21.06 kHz input, giving an SNDR of 86 dB.
A two-tone test gave a -94.4 dB IMD (Fig. 47b). The die micrograph and summary of measured
performance are shown in Fig. 48. The proposed four-cell self-coupled modulator is shown in Fig. 49
[71-72]. The prototype chip was fabricated in a 0.18 µm 2P4M CMOS process. With a 10 kHz to 1 MHz
input signal sampled with a 60 MHz clock rate, it shows over 86 dB peak SNR and SNDR, 88 dB DR, and
-98 dB THD in the 1.9 MHz signal band. The measured spectrum with a -1.13 dBFS, 100 kHz input is
shown in Fig. 50. The die micrograph and the summary of measured performance are illustrated in Fig. 51.


Figure 46. Block diagram of the ring-coupled 4-cell delta-sigma ADC


Figure 47. Measured spectra of the ring-coupled 4-cell delta-sigma ADC


Figure 48. Die photo and summary of measured performances for the ring-coupled 4-cell delta sigma ADC


Figure 49. Block diagram of the self-coupled 4-cell delta sigma ADC


Figure 50. Measured spectrum of the self-coupled 4-cell delta sigma ADC at peak SNDR


Figure 51. Die photo and summary of measured performances for self-coupled 4-cell delta sigma ADC


1.8.3  Research  Results  and  Discussion 

The main purpose of the project was the development of the multi-cell delta-sigma modulator
shown in Fig. 52. In the structure, each cell consists of a third-order low distortion modulator, with a
15-level quantizer and optimized NTF. Fig. 53 shows the block diagram of one cell. Eight such cells will be
used in the delta-sigma ADC, providing robust performance and easy programmability. Since doubling
the number of cells improves the SQNR by 3 dB, the overall eight-cell modulator achieves a 9 dB
SQNR improvement over that of a single cell. The PSD simulation results for 1-cell and 8-cell are shown
in Fig. 54. MATLAB simulation results of the proposed modulator are summarized in Table 5 for various
numbers of activated cells. They show only minor performance degradation when a few cells are
disabled.
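
The 3 dB per doubling rule can be checked against the reported figures with a short calculation. The sketch below assumes that the quantization noise of the cells is uncorrelated, so that combining N cells improves the SQNR by 10*log10(N) dB; the function name is illustrative, and the comparison values are the SQNR figures listed in Table 5.

```python
import math


def multicell_sqnr_db(single_cell_sqnr_db, n_cells):
    """SQNR of n_cells combined cells, assuming uncorrelated quantization noise."""
    return single_cell_sqnr_db + 10 * math.log10(n_cells)


# SQNR values reported in Table 5 (number of activated cells -> SQNR in dB)
table5 = {8: 79.0, 7: 78.4, 6: 77.7, 5: 76.9, 4: 75.9, 3: 74.7, 2: 73.0, 1: 69.9}

for n_cells, reported_db in table5.items():
    predicted_db = multicell_sqnr_db(table5[1], n_cells)
    print(f"{n_cells} cells: predicted {predicted_db:5.2f} dB, reported {reported_db:4.1f} dB")
```

Under this assumption the predicted values agree with the tabulated ones to within about 0.1 dB.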

To double the bandwidth of modulators, time interleaving (TI) or double sampling (DS)
architectures are commonly used. We realized a TI ADC under another CDADIC project. In this research,
we are applying double sampling. While conventional double-sampling delta-sigma modulators suffer
from noise folding induced at the DAC feedback branch, we found a way to avoid noise folding with two
novel double-sampling schemes in delta-sigma ADCs. Fig. 55 shows a single-ended version of the first
proposed double-sampling switched-capacitor circuit. Fig. 56 illustrates the other double-sampling
branch [73], [74]. By using the 1+z^-1 and 1-z^-1 factors together, noise folding can be eliminated without
stability problems. Fig. 57 shows the power spectra resulting from this scheme compared with
Senderowicz’s double-sampling DAC [75].
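
As background for the two factors mentioned above, the sketch below evaluates the magnitudes of 1+z^-1 and 1-z^-1 on the unit circle. It only illustrates the well-known locations of their zeros (at half the sampling rate and at DC, respectively); it does not model the proposed switched-capacitor circuits, and the function name is illustrative.

```python
import cmath
import math


def factor_magnitude(sign, f_norm):
    """|1 + sign * z^-1| evaluated at z = exp(j*2*pi*f_norm), with f_norm = f/fs."""
    z_inv = cmath.exp(-2j * math.pi * f_norm)
    return abs(1 + sign * z_inv)


for f_norm in (0.0, 0.125, 0.25, 0.375, 0.5):
    plus = factor_magnitude(+1, f_norm)    # 1 + z^-1: zero at fs/2
    minus = factor_magnitude(-1, f_norm)   # 1 - z^-1: zero at DC
    print(f"f/fs={f_norm:5.3f}  |1+z^-1|={plus:5.3f}  |1-z^-1|={minus:5.3f}")
```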

In the low-distortion architecture, the delay in the path from the quantizer output to the DAC
output is very critical. The data-weighted-averaging (DWA) circuit, added to filter DAC nonlinearity
errors, generates additional delay in the DAC feedback path. To mitigate this problem, the DWA circuit
delay has to be optimized. This can be done by moving the DWA structure (a four-stage logarithmic
shifter) to the quantizer input [76]. This will, however, increase the noise floor and degrade the SQNR.
Since the delay reduction required here is not too large, we can move just the first-stage switching block
to the quantizer input. The increased harmonics can be suppressed by alternating the MSBs and LSBs as
in the Segmented DWA (SeDWA) described in [77]. The block diagram of the new scheme is shown in
Fig. 58.
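
For readers unfamiliar with DWA, the sketch below shows the basic rotating-pointer element selection that a DWA block performs. It is a generic, textbook-style illustration: the element count, the input codes, and the function name are hypothetical, and it is neither the shifter-based circuit nor the SeDWA variant used in this design.

```python
def dwa_select(codes, n_elements):
    """Return, for each input code, the indices of the unit elements used.

    Elements are taken consecutively starting from a pointer that advances
    by the code value and wraps around, which spreads element usage evenly
    over time.
    """
    pointer = 0
    selections = []
    for code in codes:
        if not 0 <= code <= n_elements:
            raise ValueError("code must be between 0 and n_elements")
        chosen = [(pointer + k) % n_elements for k in range(code)]
        selections.append(chosen)
        pointer = (pointer + code) % n_elements
    return selections


# Hypothetical example: 5 unit elements and a short sequence of DAC codes.
example = dwa_select([3, 2, 4, 1], n_elements=5)
usage = [sum(sel.count(i) for sel in example) for i in range(5)]
print(example)
print(usage)
```

In this example every unit element ends up being used the same number of times, which is the averaging effect that shapes the DAC mismatch error.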

Fig. 59 shows the one-cell SPECTRE simulation result of the double-sampling digital-to-analog
converter of Fig. 55. The SNDR is 68.1 dB with an OSR of 8. The micrograph of the fabricated IC for
this project is shown in Fig. 60. Because of harmonic distortion observed in the simulation results, a
2-cell version was also fabricated to check its functionality. The total area of the 1-cell layout shown in
Fig. 60 is 0.96 x 0.37 mm² and the estimated area of the 8-cell layout is 0.96 x 2.96 mm². The performance
predicted by SPECTRE simulations is summarized in Table 6. A single-cell delta-sigma ADC which has
small size, high SNR, and low power consumption was also designed for multi-cell applications. We are
testing the fabricated IC.


Figure 52. The block diagram of the proposed eight-cell delta-sigma ADC


Figure 53. Single-cell architecture


SNDR (8-cell) = 79.0 dB, SNDR (1-cell) = 69.9 dB


Figure 54. Power spectral density for 1-cell and 8-cell ADCs


Table 5. SQNR vs. number of activated cells


Number of activated cells    SQNR [dB]
8                            79.0
7                            78.4
6                            77.7
5                            76.9
4                            75.9
3                            74.7
2                            73.0
1                            69.9


Figure 55. A switched-capacitor circuit for double sampling


Figure 56. A switched-capacitor circuit for double sampling [73]


Figure 57. Power spectral density of second-order ADCs [73]


Critical path (3-gate delay)

Figure 58. Critical path delay reduction by moving the first shifter stage


SNDR (SPECTRE) = 68.1 dB, N = 4096


Figure 59. SPECTRE simulation result with DAC of Fig. 55


Test Chips

Figure 60. Micrograph of (a) one cell and (b) whole chip


Table  6.  Summary  of  performance 


Parameter                                               Specification
Bandwidth                                               20 MHz
Clock Frequency                                         160 MHz
OSR                                                     8
SNDR 1-cell (estimated 8-cell), without thermal noise   68.1 dB (77.1 dB)
Number of Cells                                         8
Estimated Power Consumption of 1 Cell (8-cell)          16.4 mW (131.2 mW)
Technology                                              0.18 µm, 2-poly 4-metal

1.8.4  Other  Results 

1.8.4.1 Technology Transfer/Intellectual Property

None

1.8.4.2 Publications and Presentations

K. Lee, F. Maloberti, G. C. Temes, “Noise-Coupled Multi-Cell ΔΣ ADCs,” Proc. IEEE ISCAS, vol. 1,
pp. 249-252, May 2007.

K. Lee, J. Chae, and G. C. Temes, “Efficient floating double-sampling integrator for ΔΣ
ADCs,” Electron. Lett., vol. 43, no. 25, pp. 1413-1414, Dec. 2007.

K. Lee, J. Chae, and G. C. Temes, “Efficient fully-floating double-sampling integrator for ΔΣ ADCs,”
Proc. IEEE ISCAS, pp. 1440-1443, May 2008.


1.8.4.3 Benefits to Commercial Sector

Flexible  and  robust  analog-to-digital  converters  are  also  often  needed  in  commercial  applications, 
for  example  in  multi-standard  communication  systems  and  devices.  Low  power  and  high  speed  analog-to- 
digital  converters  can  be  achieved  using  the  design  shown  here. 


1.8.5  Conclusions 

Digitally programmable multi-cell delta-sigma ADCs were developed in this research project.
They are highly configurable and robust data converters, where the trade-off between resolution, speed
and power dissipation can be altered to use the same device under different conditions and in different
applications. The concepts of noise-coupled, multi-bit delta-sigma ADCs and the improved
double-sampling scheme were invented by the researchers in this group. The digitally programmable
multi-cell ADC is a flexible and robust device, well suited for implementation in nanoscale technology and
for operation in hostile environments, which is of value for military use. Flexible and robust
analog-to-digital converters are also often needed in commercial applications, such as in multi-standard
communication systems and devices.

1.8.6  Students  Receiving  AFRL  Fellowships 

None


1.9 PROJECT 9 (Addendum Project): A LOW-POWER, LOW JITTER FRACTIONAL-N FREQUENCY
SYNTHESIZER WITH WIDE-TUNING BAW-STABILIZED VCO

1.9.1  Abstract 

In  this  project,  we  investigated  and  prototyped  the  first  bulk-acoustic  wave  resonator  (BAW)  based 
integer-N  phase  locked  loop  (PLL),  demonstrating  that  low  power  and  low  phase  noise  performance  can 
be  simultaneously  achieved  using  extremely  high  Q  (>2000)  BAW  resonators.  However,  such  high  Q 
limits  the  tuning  range  of  the  BAW-based  PLL.  We  are  addressing  this  issue  by  proposing  a  multi-BAW- 
resonator  tank  strategy,  in  combination  with  negative  capacitance  techniques,  to  enhance  the  tuning  range 
of  the  VCO  without  fundamentally  sacrificing  the  power  and  phase  noise  performance.  We  analyzed, 
designed, and fabricated an inductor-free, low power, low phase-noise/jitter, wide tuning range
BAW-based fractional-N frequency synthesizer.

1.9.2  Project  Description 

In the first six months of this project, we finished prototyping and testing the integer-N BAW
PLL. The PLL output frequency inherits the accuracy and close-in purity of the reference while providing
over 30dB lower phase noise at high frequency offsets than traditional VCOs for a given power
dissipation. The entire PLL operates from a 1V supply with 750µW power dissipation and demonstrates
0.6ps of jitter (10kHz-10MHz) and phase noise of -82dBc/Hz and -138dBc/Hz at 1kHz and 1MHz offsets,
respectively, at a center frequency of 1.575GHz. The tuning range of the PLL is 1.3MHz. These results are
consistent with our theoretical study and prediction. The design was fabricated in a 0.13µm CMOS process.
The results have been submitted to RFIC 2009.

We performed a comprehensive theoretical study of the BAW resonator tuning range. We originally
proposed to use multi-tank BAW resonators to obtain a wider VCO tuning range. However, the parasitic
resistance and capacitance of CMOS switches in a common CMOS process completely defeat the
multi-tank purpose: the excessive parasitics built into the multi-resonator tank seriously degrade the
Q and/or consume the available tuning range. We also found that a MOS varactor, commonly used for VCO
tuning, is an extremely inefficient tuning vehicle for BAW resonators because of its significant parasitic
capacitance. Therefore, we decided to use a single BAW resonator tank, avoiding direct
application of MOS switches in the resonator tank, and to use high quality MIMCAPs to tune the
oscillator. We further explored various oscillator topologies and concluded that a single-ended Pierce
oscillator can provide the widest tuning range when tuned with switched MIMCAP arrays. Simulation
shows this can achieve a tuning range 6 times larger than what we achieved in the integer-N PLL
implementation.

Such a digitally controlled oscillator (DCO) imposes new design challenges and opens up a new
design paradigm for the phase-locked loop. We designed an all-digital fractional-N phase-locked loop
(ADPLL) to lock the DCO to a crystal reference. The loop design focused on reducing the noise
originating from digitization and quantization in various parts of the loop, in addition to the particulars
associated with the high Q BAW resonators.

1.9.3  Research  Results  and  Discussion 

Low  power,  low  phase-noise/jitter  integer-N  frequency  synthesizer 

Figure 61 shows the integer-N type II PLL architecture. For comparison, an identical LC
PLL was also implemented on the same die. The FBAR VCO demonstrates a power/phase-noise
figure-of-merit over 30 dB better than LC oscillators at the expense of tuning range [78]. As a result, an
FBAR-based PLL does not require a high loop bandwidth to suppress VCO noise. The relaxed loop bandwidth
specification, in turn, reduces spur amplitudes. The frequency divider uses an 8/9 dual-modulus prescaler
adjustable through two 4-bit programmable counters N1 and N2. The low power of the VCO demands
careful design in the prescaler to reduce its relative power contribution. The dual-modulus prescaler is
implemented with true single-phase-clock (TSPC) dynamic logic, demonstrating better power-delay
efficiency than static CMOS implementations.
Figure 61. 4th order type II FBAR-based PLL architecture. The FBAR and LC implementations are identical except for the VCO tank


Figure  62.  The  schematic  of  the  charge  pump  and  the  CP  current  DAC 


The charge pump (CP) is biased with a 5-bit current DAC and an on-chip current reference
(Figure 62). The DAC provides programmability of the charge/discharge current, a requirement for loop
stability control since the two PLLs (FBAR and LC) have vastly different VCO gains. To support a wide
output voltage swing (and thus tuning range), the charge pump employs OTA1 to enhance up- and
down-current matching. OTA2 reduces the effect of charge injection to help suppress reference spurs. Both
OTAs use a folded cascode topology and consume 12µA each while providing a voltage gain greater than
40dB. The loop filter uses one 220pF off-chip capacitor. The FBAR PLL configuration exhibits a loop
bandwidth of 10kHz with a phase margin of approximately 65° (60kHz with a 55° phase margin for the
LC PLL). Unlike an LC tank, the FBAR presents a high capacitive impedance at low frequencies (see
Figure 63a).
Figure 63. The VCO design: (a) LC and BAW; (b) the VCO schematic


Thus, the VCO uses a common-mode feedback (CMFB) loop and a high-pass negative resistance
response provided by capacitive source coupling to ensure low-frequency stability [78] (Figure 63b).

VCO tuning is performed with two small (6µm x 6µm) NMOS varactors; excessive capacitive loading
degrades the already limited tuning range of the VCO. The LC VCO exhibits a gain (Hz/V) over 80 times
larger than the FBAR VCO. Resistive loading reduces the Q of the resonator, ultimately degrading the
phase noise performance of the PLL. All resistive loading on the VCO (primarily from the CMFB
resistors and divider buffers) was made greater than 5kΩ to avoid significant resonator loading.

The design was fabricated in a 0.13µm CMOS process, occupying 320µm x 250µm for the FBAR
PLL and 320µm x 550µm for the LC PLL, both excluding pads. The 560µm x 705µm FBAR die is
wirebonded directly to the 0.13µm CMOS die (Figure 64), allowing placement in a single IC package.
Alternatively, the FBAR die can be flip-chip bonded onto the CMOS at the wafer or die level. The chip
operates at a supply voltage of 1.0V with a total power dissipation of 750µW (500µW for the VCO and
250µW for the rest of the circuitry, including divider, divider buffer, CP, PFD, and crystal buffer). The
measured settling time of the FBAR PLL is 600µs with a CP current of 10µA. Measured results were
consistent across all divide ratios (30-130).


Figure  64.  Chip  micrograph  of  the  FBAR  and  LC  PLLs 
Figure 65 shows the phase noise of the locked and free running FBAR VCO measured with an
Agilent E5052B signal source analyzer. An external 45 MHz crystal oscillator was used as the reference
with a divide ratio N=35 (fRF = 1575 MHz). The measured crystal reference phase noise, scaled by
20·log(N) dB, is plotted to illustrate the reference noise contribution. Additionally, the phase noise of the
locked LC PLL is shown for comparison. The measured phase noise of the FBAR PLL is -138dBc/Hz at
1MHz offset, 37dB lower than the comparison LC PLL operating at the same VCO power consumption.
Close-in phase noise of the FBAR PLL is dominated by the reference phase noise contribution:
-75dBc/Hz at 100Hz offset, which is nearly 20dB better than that of the LC PLL. Reference spur rejection
is enhanced by the low loop bandwidth and careful charge pump design. The measured reference spur
level of the FBAR PLL is -77dBc. The integrated RMS jitter from 10kHz to 10MHz is 0.6ps for the
FBAR PLL and 20ps for the LC PLL.
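The relationship between the reference, the divide ratio, and the scaled reference noise quoted above can be checked with a few lines of arithmetic. The sketch below is illustrative only: the function names are hypothetical, and the "implied reference noise" line assumes that the close-in figure at 100 Hz offset is set entirely by the reference, which the text suggests but does not quantify.

```python
import math


def synthesized_frequency_mhz(f_ref_mhz: float, n: int) -> float:
    """Output frequency of an integer-N PLL: f_out = N * f_ref."""
    return n * f_ref_mhz


def reference_noise_scaling_db(n: int) -> float:
    """Amount (in dB) by which reference phase noise is raised at the PLL output."""
    return 20.0 * math.log10(n)


N = 35            # divide ratio quoted in the text
F_REF_MHZ = 45.0  # crystal reference frequency quoted in the text

f_out = synthesized_frequency_mhz(F_REF_MHZ, N)
scaling = reference_noise_scaling_db(N)

# Assumption: the -75 dBc/Hz close-in value at 100 Hz offset is due to the
# reference alone, so the reference itself sits 20*log10(N) dB lower.
implied_ref_noise = -75.0 - scaling

print(f"f_out = {f_out:.0f} MHz")
print(f"20*log10(N) = {scaling:.2f} dB")
print(f"implied reference noise at 100 Hz = {implied_ref_noise:.1f} dBc/Hz")
```

For N=35 the scaling is roughly 31 dB, so a low divide ratio from a clean crystal is what keeps the close-in noise low.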


Figure  65.  Measured  phase  noise  performance  comparison 


Figure  66.  Frequency  vs.  temperature  stability  measurements 
The measured temperature stability of the FBAR VCO in both free running and locked states is
shown in Figure 66. Mechanical temperature compensation of the resonator allows locking to a fixed
reference over a wide temperature range even with a limited VCO tuning range. Table 7 provides a
performance summary of the FBAR and LC PLLs.


Table 7. The integer-N BAW PLL performance summary

Parameter                    FBAR PLL            LC PLL              [79]
                             (this work)         (this work)
Process (CMOS)               0.13µm              0.13µm              0.13µm
Vdd                          1V                  1V                  1.2V
PLL type                     type II integer-N   type II integer-N   type II integer-N
fcenter (GHz)                1.575               1.89                2.0
fref (MHz)                   45                  45                  250
Loop BW (kHz)                ~10                 ~60                 ~10,000
Tuning range (MHz)           1.35                107                 2,000
Power (µW)
  VCO                        500                 500                 9,000
  Divider                    ~95                 ~115                -
  PFD/CP/XTL Buf             ~60                 ~40                 -
  Divider Buf                ~95                 ~115                -
  Total                      750                 770                 23,000
Reference spur (dBc)         -77                 -66                 -68.5 to 48
Phase noise (dBc/Hz)
  @1kHz                      -82                 -61                 -82
  @10kHz                     -85                 -62                 -106
  @100kHz                    -114                -75                 -118
  @1MHz                      -138                -101                -125
  @3MHz                      -149                -113                -120
Integrated RMS jitter (ps)
  1kHz-40MHz                 1.4                 24.8                0.58
  10kHz-10MHz                0.6                 20                  -
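As a rough cross-check of the "over 30 dB" figure-of-merit claim, the Table 7 entries can be run through the commonly used oscillator figure of merit, FoM = L(Δf) - 20·log10(f0/Δf) + 10·log10(P/1mW). This is a sketch under stated assumptions: the report does not say which FoM definition it uses, and the locked-PLL phase noise at 1MHz offset (well outside both loop bandwidths) is taken here as a proxy for the VCO noise. The function name is hypothetical.

```python
import math


def oscillator_fom_dbc_hz(pn_dbc_hz: float, f0_hz: float,
                          offset_hz: float, power_w: float) -> float:
    """Common oscillator FoM; a more negative value is better.

    FoM = L(df) - 20*log10(f0/df) + 10*log10(P / 1 mW)
    """
    return (pn_dbc_hz
            - 20.0 * math.log10(f0_hz / offset_hz)
            + 10.0 * math.log10(power_w / 1e-3))


# Table 7 entries: phase noise at 1 MHz offset, centre frequency, VCO power.
fom_fbar = oscillator_fom_dbc_hz(-138.0, 1.575e9, 1e6, 500e-6)
fom_lc = oscillator_fom_dbc_hz(-101.0, 1.89e9, 1e6, 500e-6)
fom_gap_db = fom_lc - fom_fbar  # positive means the FBAR VCO is better

# Power budget rows of Table 7 (microwatts); the block values are approximate.
fbar_power_uw = {"VCO": 500, "Divider": 95, "PFD/CP/XTL Buf": 60, "Divider Buf": 95}
lc_power_uw = {"VCO": 500, "Divider": 115, "PFD/CP/XTL Buf": 40, "Divider Buf": 115}

print(f"FBAR FoM = {fom_fbar:.1f} dBc/Hz, LC FoM = {fom_lc:.1f} dBc/Hz")
print(f"FoM gap  = {fom_gap_db:.1f} dB")
print(f"power totals: FBAR {sum(fbar_power_uw.values())} uW, "
      f"LC {sum(lc_power_uw.values())} uW")
```

With these inputs the gap comes out at about 35 dB, and the per-block power entries add up to the 750µW and 770µW totals listed in the table.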

A wide tuning, low power and low phase noise DCO and the fractional-N ADPLL

We are currently working to increase the tuning range of the BAW-based VCO (Figure 67 shows
a switched-resonator configuration). Figure 68 is a plot of Q and impedance of a BAW resonator versus
frequency with and without the non-ideal series CMOS switches. The CMOS switches contribute a series
resistance and a shunt capacitance to the tank. Each process has a characteristic switch time constant
τ = Ron·Coff. The tradeoff between on-resistance and off-capacitance is exercised by adjusting the switch
width. The shunt capacitor reduces the parallel resonant frequency, decreasing the tuning range of the
resonator and the Rp value. The series resistor reduces the Q of the resonator quickly as the frequency
moves off the maximum Q frequency during frequency tuning.

Reduction in Rp results in an increase in power consumption. Reduction in Q elevates VCO
phase noise. Our original goal was a maximum of 3dB phase noise degradation with increased tuning
range. Our conclusion is that the switch τ for our 0.13µm CMOS process is prohibitively large for
allowing switched-resonator oscillator tuning.
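The reason the switch time constant is a property of the process, not of the switch sizing, can be illustrated with a simple scaling model. The sketch below assumes an idealised switch whose on-resistance is inversely proportional to width and whose off-capacitance is proportional to width; the per-unit-width values are hypothetical placeholders, not data for the process used in this project.

```python
def switch_parasitics(width_um: float, ron_ohm_um: float, coff_ff_per_um: float):
    """Idealised MOS switch model: Ron ~ 1/W and Coff ~ W."""
    ron_ohm = ron_ohm_um / width_um        # wider switch -> lower series resistance
    coff_ff = coff_ff_per_um * width_um    # wider switch -> more shunt capacitance
    return ron_ohm, coff_ff


# Hypothetical per-unit-width values, chosen only for illustration.
RON_OHM_UM = 500.0
COFF_FF_PER_UM = 1.0

taus_fs = []
for width in (10.0, 50.0, 200.0):
    ron, coff = switch_parasitics(width, RON_OHM_UM, COFF_FF_PER_UM)
    tau_fs = ron * coff  # ohm * fF = femtoseconds
    taus_fs.append(tau_fs)
    print(f"W = {width:5.0f} um: Ron = {ron:6.2f} ohm, "
          f"Coff = {coff:6.1f} fF, Ron*Coff = {tau_fs:.0f} fs")
```

Changing the width only moves the penalty between series resistance (lower Q) and shunt capacitance (lower tuning range and Rp); the product stays fixed, which is why a large time constant rules out the switched-resonator approach regardless of sizing.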

The  large  parasitic  capacitance  of  an  on-chip  MOS  varactor  commonly  used  for  VCO  tuning 
loads  down  the  resonator  tank  as  well  when  the  varactor  is  directly  shunted  to  the  resonator  tank.  The 
sensitivity  of  a  BAW  resonator  to  the  capacitive  loading  makes  such  a  varactor  extremely  inefficient  as  a 
tuning  device.  High  quality  MIM  capacitors,  in  contrast,  are  the  closest  to  ideal  capacitors  among  all 
available on-chip capacitors (MIM caps in our 0.13µm IBM process exhibit a 1% backplate parasitic).
Therefore,  we  choose  to  use  switched  MIM  caps  to  maximize  tuning  range,  thus  requiring  a  digitally 
controlled  BAW  oscillator  (DCO). 
Figure 67. The proposed multitank VCO tuning


Figure 68. Q and Rp change with non-ideal RF switches


Figure 69 shows the proposed DCO, which uses a single-ended Pierce topology. The switched capacitor
array is thermometer-coded, ensuring monotonic frequency tuning. One drawback of MIM caps is the
large minimum size, which introduces DCO quantization noise, degrading PLL phase noise performance.
To mitigate the problem, we use two minimum-sized MIMCAPs in series to obtain a smaller minimum
tuning capacitor. Additionally, we halve the frequency step through alternate switching of the
capacitors to the two sides of the oscillator. Simulation shows that with the same level of power consumption
(~500µW), the achievable tuning range of the DCO is over 0.5%, six times larger than the implementation
of the cross-coupled pair tuned with a MOS varactor. The open loop DCO has a frequency resolution of
about 0.3ppm, which is coarse enough for the DCO quantization noise to dominate the phase noise of the
PLL. Thus, second-order delta-sigma dithering at 150MHz is applied to the DCO control word.
Figure 70 shows that this technique suppresses the phase noise due to the quantization error below the
phase noise level determined by non-digitization factors.
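The numbers in this paragraph can be tied together with a short calculation. The sketch below is an illustration under assumptions: it takes the DCO centre frequency to be the 1.575GHz of the integer-N prototype (the text does not state the DCO frequency explicitly), and the unit-capacitor values, including their mismatch, are hypothetical. All function names are likewise hypothetical.

```python
F_CENTER_HZ = 1.575e9  # assumption: same centre frequency as the integer-N PLL


def fraction_to_hz(fraction: float, f0_hz: float = F_CENTER_HZ) -> float:
    """Convert a fractional frequency (e.g. 0.3e-6 for 0.3 ppm) to Hz."""
    return fraction * f0_hz


def series_capacitance(c1: float, c2: float) -> float:
    """Two capacitors in series; equal values give half the unit capacitance."""
    return c1 * c2 / (c1 + c2)


def thermometer_code(value: int, n_units: int) -> list:
    """Thermometer code: the first `value` unit cells are on, the rest off."""
    if not 0 <= value <= n_units:
        raise ValueError("value out of range")
    return [1] * value + [0] * (n_units - value)


def array_capacitance(value: int, unit_caps: list) -> float:
    """Total switched-in capacitance for a given control value."""
    bits = thermometer_code(value, len(unit_caps))
    return sum(c for c, b in zip(unit_caps, bits) if b)


tuning_range_hz = fraction_to_hz(0.005)   # 0.5 % tuning range
step_hz = fraction_to_hz(0.3e-6)          # 0.3 ppm resolution
min_cap = series_capacitance(2.0, 2.0)    # two equal minimum caps in series

# Hypothetical mismatched unit capacitors (arbitrary units). Each increment
# only ever adds one more positive capacitor, so the total cannot decrease.
unit_caps = [1.00, 1.07, 0.94, 1.02]
totals = [array_capacitance(v, unit_caps) for v in range(len(unit_caps) + 1)]

print(f"tuning range ~ {tuning_range_hz / 1e6:.3f} MHz")
print(f"frequency step ~ {step_hz:.1f} Hz")
print(f"series minimum cap = {min_cap} (half of one unit)")
print("array capacitance per code:", totals)
```

Under the stated frequency assumption, 0.5% corresponds to roughly 7.9MHz, about six times the 1.3MHz tuning range of the varactor-tuned prototype, and 0.3ppm corresponds to a step of a few hundred hertz.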
Figure 69. The DCO


Figure  70.  Quantization  noise  of  open  loop  DCO  can  be  suppressed  after  control  word  dithering 

An all-digital fractional-N PLL [80] was used to lock the BAW DCO to a crystal reference
(Figure 71). This allows direct digital-to-digital integration from the loop filter to the DCO, removing
the additional power consumption and additional noise from analog-to-digital conversion. In addition, the
large capacitors required for an analog loop filter are replaced with a digital filter, which is
significant for a low-loop-bandwidth PLL, where the analog filter typically requires a large on-chip
capacitance. Furthermore, a digital implementation intrinsically has better immunity to the various noise
sources associated with an analog implementation. With such an ADPLL, we expect the reference spur and
phase noise performance at low frequency offsets to improve; consequently, a slightly larger loop
bandwidth can be used for better phase noise and settling time performance.


Figure 71. The ADPLL architecture
Test Chips

The integer-N BAW PLL prototype was fabricated in a 0.13µm CMOS process, and its testing was
completed. The design achieved phase noise of -82 and -138dBc/Hz at 1kHz and 1MHz
offset, respectively, at a center frequency of 1.575 GHz with power consumption of 750µW. The tuning
range of the PLL is 1.3MHz. We performed a comprehensive study on the limits of the tuning range of
the PLL. We explored the use of discrete switched cap arrays to tune the BAW tank digitally.
Simulations show that the tuning range of the resulting DCO increases about 6 times over the MOSCAP
(varactor) tuned VCO.

1.9.4  Other  Results 

1.9.4.1  Technology  Transfer/Intellectual  Property 

Not  yet 

1.9.4.2 Resulting Publications and Presentations
Not  yet 

1.9.4.3 Benefits to Commercial Sector

The  growth  of  the  mobile  phone  market  has  been  driven  by  demands  for  more  functionality  with 
smaller  and  more  affordable  handsets.  CMOS  technology  scaling  has  enabled  continued  cost  reduction 
with  increased  functionality  for  digital  blocks  in  the  system,  yet  the  front-end  RF  transceiver  has  not 
enjoyed  similar  scaling.  One  significant  hindrance  has  resulted  from  the  realization  of  the  local 
oscillators.  This  project  entailed  an  investigation  and  design  of  a  frequency  synthesizer  utilizing  a  BAW- 
based  VCO  to  provide  a  low  power,  low  phase  noise,  and  small  implementation  area  LO  solution  to  meet 
increasingly  stringent  market  demands.  In  contrast  to  our  prior  fixed-frequency  synthesizers,  we  focused 
on a wide-tuning fractional-N ΣΔ synthesizer with fine frequency steps. The overall goal was to maintain
the  large  noise/power  performance  benefits  of  a  high  Q  synthesizer  while  allowing  wide  tuning  to 
accommodate  multiple  standards.  Our  techniques  for  bandswitching,  wide  VCO  tuning,  and  low  power 
fractional-N  synthesis  are  applicable  to  MEMS-  and  LC-  tuned  systems  alike. 

1.9.5  Conclusions 

This work demonstrated the power/noise benefits of emerging resonator technologies, and their
practical use, by developing new PLL topologies with very low power consumption and jitter/phase noise.
The emergence and proliferation of high Q microscale resonators
have, to date, mainly impacted the passive RF filter community. This project performed the first study of
incorporating high Q BAW resonators into a frequency synthesizer. Although the phase noise and power
consumption are expected to improve dramatically, the VCO tuning range and PLL loop dynamics also
change, requiring careful analysis and design of the PLL loop dynamics. This project resulted in
prototypes that validated the low power, high Q PLL analysis. This analysis will help industry and
defense assess future passive component technologies as well as emerging frequency synthesizer
architectures and can greatly influence other parts of wireless systems.

1.9.6  Students  Receiving  AFRL  Fellowships 

None 
1.10 PROJECT 10: WIDEBAND LOW-POWER DELTA-SIGMA CONVERTER


1.10.1  Abstract 

Novel  design  techniques  were  developed  for  low-power,  wide-band  delta-sigma  analog-digital 
converters.  Both  continuous-time  and  discrete-time  devices  were  studied.  Architectural  as  well  as 
transistor-circuit-level  innovations  were  introduced  to  reduce  power  while  maintaining  fast  operation  and 
high  resolution.  Several  test  chips  were  developed  to  verify  the  new  concepts  and  methodologies. 


1.10.2  Project  Description 

This  project  investigated  novel  design  techniques  for  low-power,  wide-band  delta-sigma  ADCs.  Both 
continuous-time  and  discrete-time  devices  were  studied.  Architectural  as  well  as  transistor-circuit-level 
innovations  were  introduced  to  reduce  power  while  maintaining  fast  operation  and  high  resolution.  The 
resulting  design  techniques  are  applicable  for  data  converters  needed  in  communication  systems,  video, 
and  RF  devices.  The  project  involves  the  introduction  of  novel  structures,  as  well  as  novel  circuitry,  to 
achieve  optimal  operation  for  delta-sigma  ADCs  in  terms  of  bandwidth,  resolution  and  power  dissipation. 
Design  techniques  for  low-power,  high-accuracy  and  wide-band  analog-to-digital  data  converters  will  be 
provided.  These  have  high  importance  in  many  military  and  consumer  electronics  applications  (cell 
phones,  digital  video),  communication  systems,  and  other  applications. 

1.10.3  Research  Results  and  Discussion 

Two  wideband,  low-power  delta-sigma  ADCs  were  designed  using  the  proposed  novel  techniques. 
They  include  a  cascade  discrete-time  wideband  delta-sigma  ADC  which  does  not  require  noise-leakage 
compensation,  and  a  cascade  continuous-time  wideband  delta-sigma  ADC  with  adaptive  digital 
suppression  of  the  first-stage  quantization  noise  leakage.  A  new  architecture  was  developed  for 
continuous-time delta-sigma ADCs. It uses a noise-coupling technique to achieve a first- or higher-order
noise-shaping enhancement. Simulations predict improved performance compared to existing structures.
The  details  of  these  projects  are  described  below. 

1. The block diagram of the cascaded discrete-time delta-sigma modulator is shown in Fig. 72. By
exploiting a double-sampling technique, the effective oversampling ratio is doubled, which improves
the modulator SQNR significantly. A simple and effective capacitor reset technique is used to fully
eliminate the noise folding due to capacitor mismatch in the double-sampling feedback DACs.
With a multi-bit quantizer in the first-stage modulator, the first-stage quantization noise leakage is
almost negligible even with moderate opamp gain. A novel double-sampling comparator with input
offset cancellation is designed to enhance the robustness of the whole modulator and to save
power. Designed in a 90-nm logic CMOS process with a 1.2V power supply, the prototype is
targeted for a signal bandwidth of 20MHz, 12-bit resolution, and less than 20mW power
consumption. Fig. 73 shows the overall modulator layout. The corresponding post-layout simulation
result is shown in Fig. 74. The SNDR is 71.9dB with a 1.5MHz input signal and a 320MHz effective
sampling frequency.
Figure 72. The system-level block diagram of the DT 2-2 MASH ADC


Figure 74. Post-layout simulation results of the DT 2-2 MASH for a 1.5MHz, -1.9 dBFS input signal (SNDR = 71.9 dB)


2. Fig. 75 shows the block diagram of the cascade continuous-time delta-sigma modulator. A
feed-forward architecture is used in the first stage because it consumes less power than the feedback
structure. A first-order passive pre-filter has been added to correct for the STF out-of-band peaking
due to the feed-forward structure. A low-distortion topology [81], modified for continuous-time
delta-sigma ADCs, is used so that the second integrator output can be used directly as the input of the
second-stage modulator with large inter-stage gain. An adaptive digital calibration scheme [82] is used
to minimize the noise leakage due to analog circuit imperfections, RC time constant variations, and
component mismatches. Using a 90-nm CMOS technology and a 1.2 V power supply, the modulator is designed
for a 20 MHz signal bandwidth and 12-bit resolution, with 20 mA current consumption. Fig. 76
shows the modulator layout. With a 2.2 MHz input signal and a 320 MHz clock frequency, 62.4 dB
SNDR can be achieved before calibration. The signal power is only -30 dBFS at the second integrator
output. The results of the post-layout simulation are shown in Fig. 77.
Figure 75. The system-level block diagram of the CT 2-2 MASH ADC

Figure 76. The layout diagram of the CT 2-2 MASH ADC

Figure 77. Post-layout simulation results for the CT 2-2 MASH ADC for a 2.2 MHz, 0 dBFS input signal
3. The noise-coupling technique has been successfully applied to discrete-time delta-sigma ADCs [83-84],
which demonstrated both excellent linearity and good power efficiency. By using a
switched-capacitor (instead of continuous-time) adder, this technique can also be effectively used for
continuous-time delta-sigma ADCs. Fig. 78 shows the block diagram of a self-coupling continuous-time
delta-sigma ADC, which injects the quantization noise at the adder of the modulator with a one-cycle
delay. The effective noise-shaping order is thereby raised by one. Fig. 79 shows the Spectre
simulation results for a second-order noise-coupling continuous-time delta-sigma modulator. With
only two integrators, third-order noise-shaping performance is achieved. Higher-order noise-shaping
enhancement is also realizable for the continuous-time delta-sigma modulator.
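To give a feel for what raising the noise-shaping order by one buys at a low oversampling ratio, the in-band quantization noise can be compared for idealised noise transfer functions. This is a sketch under stated assumptions, not a model of the designed modulators: it assumes pure-differentiator NTFs of the form (1 - z^-1)^L, white quantization noise, and the OSR of 8 quoted for the simulated spectrum. The function name is hypothetical.

```python
import math


def inband_noise_gain(order: int, osr: int, steps: int = 20000) -> float:
    """Fraction of white quantization noise power left in band.

    Integrates |1 - exp(-jw)|^(2*order) = (2*sin(w/2))^(2*order) over
    0 <= w <= pi/osr with the midpoint rule and normalises by pi.
    """
    w_max = math.pi / osr
    dw = w_max / steps
    total = 0.0
    for k in range(steps):
        w = (k + 0.5) * dw
        total += (2.0 * math.sin(w / 2.0)) ** (2 * order) * dw
    return total / math.pi


OSR = 8
gain_2nd = inband_noise_gain(2, OSR)  # two integrators, no noise coupling
gain_3rd = inband_noise_gain(3, OSR)  # effective order with noise coupling
improvement_db = 10.0 * math.log10(gain_2nd / gain_3rd)

print(f"in-band noise gain, 2nd order: {10 * math.log10(gain_2nd):.1f} dB")
print(f"in-band noise gain, 3rd order: {10 * math.log10(gain_3rd):.1f} dB")
print(f"improvement from the extra order: {improvement_db:.1f} dB")
```

Under these idealisations the extra order is worth somewhat under 10 dB of in-band quantization noise at OSR=8; the benefit grows quickly at higher oversampling ratios.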


Figure 78. The block diagram of second-order noise-coupled continuous-time delta-sigma ADC


Figure 79. The simulated spectrum of the second-order self-coupled continuous-time delta-sigma ADC (OSR=8, nLev=17)
Test Chips

A  cascade  discrete-time  delta-sigma  ADC  without  requiring  noise-leakage  compensation  was 
designed  and  laid  out.  A  cascade  continuous-time  delta-sigma  ADC  with  adaptive  digital  calibration  for 
noise-leakage  was  also  designed  and  laid  out. 
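The design targets and simulated SNDR values quoted for the two laid-out modulators can be related through two standard conversions: the oversampling ratio, OSR = fs/(2·BW), and the effective number of bits, ENOB = (SNDR - 1.76)/6.02. The sketch below only restates the report's numbers through these textbook formulas; the function names are hypothetical.

```python
def oversampling_ratio(fs_hz: float, bw_hz: float) -> float:
    """OSR = fs / (2 * signal bandwidth)."""
    return fs_hz / (2.0 * bw_hz)


def enob_from_sndr(sndr_db: float) -> float:
    """Effective number of bits for a full-scale sine: (SNDR - 1.76) / 6.02."""
    return (sndr_db - 1.76) / 6.02


def sndr_for_bits(bits: float) -> float:
    """SNDR (dB) that corresponds to a given resolution in bits."""
    return 6.02 * bits + 1.76


osr = oversampling_ratio(320e6, 20e6)  # 320 MHz sampling, 20 MHz bandwidth
enob_dt = enob_from_sndr(71.9)         # DT MASH, post-layout simulation
enob_ct = enob_from_sndr(62.4)         # CT MASH, before calibration
target_sndr = sndr_for_bits(12)        # 12-bit design target

print(f"OSR = {osr:.0f}")
print(f"DT MASH ENOB ~ {enob_dt:.2f} bits")
print(f"CT MASH ENOB (uncalibrated) ~ {enob_ct:.2f} bits")
print(f"SNDR needed for 12 bits ~ {target_sndr:.1f} dB")
```

Both designs run at an oversampling ratio of only 8, which is why the architectural measures described above (double sampling, multi-bit quantization, cascading, and noise coupling) matter for reaching the 12-bit target.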


1.10.4  Other  Results 

1.10.4.1  Technology  Transfer/Intellectual  Property 

Design techniques for low-power, high-accuracy and wide-band analog-to-digital data converters
were transferred to companies, and publications were made available. Numerous presentations were given
to industry.

1.10.4.2  Resulting  Publications  and  Presentations 

K. Lee, M.R. Miller and G.C. Temes, “An 8.1 mW, 82 dB Delta-Sigma ADC with 1.9 MHz BW and -98
dB THD,” Proc. IEEE Custom Integrated Circuits Conf. (CICC), pp. 93-96, Sept. 2008.

Y. Wang, K. Lee and G.C. Temes, “A 2.5MHz BW and 78dB SNDR Delta-Sigma Modulator Using
Dynamically Biased Amplifiers,” Proc. IEEE Custom Integrated Circuits Conf. (CICC), pp. 97-100, Sept.
2008.

K. Lee, et al., “A Noise-Coupled Time-Interleaved Delta-Sigma ADC With 4.2 MHz Bandwidth, -98 dB
THD, and 79 dB SNDR,” IEEE Journal of Solid-State Circuits, Vol. 43, No. 12, pp. 2601-2661, Dec. 2008.

G.C. Temes, “New Architectures for Low-Power Delta-Sigma A/D Converters,” keynote address, IEEE
APCCAS, Macao, China, Nov.3-Dec.3, 2008.

1.10.4.3  Benefits  to  Commercial  Sector 

Design  techniques  for  low-power,  high-accuracy  and  wide-band  analog-to-digital  data  converters 
will  be  provided  to  companies.  These  have  high  importance  in  consumer  electronics  (e.g.  cell  phones, 
digital  video),  communication  systems,  and  other  applications. 

1.10.5  Conclusions 

This research explored novel design techniques for low-power, wide-band delta-sigma
ADCs. Both continuous-time and discrete-time devices were studied. Architectural as well as
transistor-circuit-level innovations were introduced to reduce power while maintaining fast operation and high
resolution. The research provided design techniques for low-power, high-accuracy and wide-band
analog-to-digital data converters. The uniqueness of this project lay in the introduction of novel structures, as
well as novel circuitry, to achieve optimal operation for delta-sigma ADCs in terms of bandwidth,
resolution and power dissipation. These have high importance in consumer electronics (e.g. cell phones,
digital video), communication systems, RF devices, and other applications important for military
use.

1.10.6  Students  Receiving  AFRL  Fellowships 

None 
2. DOMESTIC WORKFORCE DEVELOPMENT


2.1  Summary 

Students at all of CDADIC’s affiliated universities were awarded one-year (2007-2008)
research-education fellowships through the CDADIC/AFRL Nanoscale Microelectronic Circuit
Development grant. Ten (10) AFRL Graduate Fellowships and 14 AFRL Undergraduate
Fellowships were funded through the domestic workforce development effort of the program.
Students participated in projects described in the technical research activities of this report. All
students were U.S. citizens, as required, and worked closely with CDADIC faculty on research in
microelectronic circuit development. The intent of the fellowships was to encourage students to
pursue graduate study in the discipline and then careers in the field. A two-year follow-up of
students in the CDADIC/AFRL domestic workforce development program revealed a very high
success rate. Five of the 10 students receiving graduate fellowships
continued in the graduate program, with all pursuing a Ph.D. in microelectronics. The remaining
five students graduated and moved on to careers in the microelectronics field. Of the 14
undergraduate students who received AFRL Fellowships, 7 continued on to graduate school and
7 are now working in the industry, in both the commercial and defense sectors.

2.2  Introduction 

The  purpose  of  the  domestic  workforce  development  program  under  the  CDADIC/AFRL 
Nanoscale  Microelectronic  Circuit  Development  grant  was  to  attract  more  U.S.  citizen  graduate 
and  undergraduate  students  to  universities  at  the  forefront  in  this  engineering  field.  The  shortage 
of  U.S.  graduate  students  in  this  discipline  not  only  affects  the  ability  of  universities  to  conduct 
important  research,  but  also  hampers  the  ability  of  academia  to  produce  a  highly  educated, 
sustainable  workforce  to  meet  future  research  needs.  With  the  high  concentration  of  international 
graduate  students  in  many  U.S.  electrical  engineering  programs  today,  it  has  been  increasingly 
difficult  for  some  universities  to  conduct  research  in  newly-defined  sensitive  areas.  With 
tightened  restrictions  governed  by  the  federal  Export  Administration  Regulations  (EAR)  and 
International  Traffic  in  Arms  Regulations  (ITAR),  there  is  a  critical  need  for  universities  to  have 
the  resources  necessary  to  recruit  more  U.S.  students  into  engineering  programs  focusing  on 
mixed-signal  circuit  design  technology. 

The  domestic  workforce  development  plan  established  in  this  initiative  included  both 
undergraduate  and  graduate  students.  The  intent  was  to  encourage  undergraduates  to  pursue 
graduate study and careers in the field of microelectronics, and to provide graduate students with
research opportunities led by leading faculty researchers and with real-world experience
gained from center industry members.

One-year stipends were awarded to 24 students representing all CDADIC universities.
Students were nominated by CDADIC faculty, with final selections made by the CDADIC
Executive Committee. Fourteen (14) undergraduate fellowships and 10 graduate fellowships
were provided to students who were U.S. citizens. Each student was mentored by an established
CDADIC research faculty member and participated in the research activities described in
Part 1 of this report. Table 8 lists the students
receiving fellowships, their affiliated university, educational level, advisor, and the research theme
of their project. As a final requirement, all students prepared a short paper at the end of their term
describing their research activities and fellowship experiences (Appendix A: student reports).

Table 8. Student Fellowship Recipients: University, Advisor, and Project, 2007-2008

AFRL Graduate Fellowships: 10

Conrad Donovan (Washington State University)
M.S. Student; Advisor: Prof. Deuk Heo
Project: Power Management System for Energy Harvesting

Steven Gaskill (Oregon State University)
M.S. Student; Advisor: Prof. Andreas Weisshaar
Project: Electrical Models of Dummy Metal Fill

Mark Hale (University of Tennessee)
Ph.D. Student; Advisor: Prof. Ben Blalock
Project: A Digital-to-Analog Converter Architecture for Multi-Channel Applications

Bill Hamon (Washington State University)
Ph.D. Student; Advisor: Prof. George La Rue
Project: Digitally Controlled Synthesizer in 90 nm CMOS

Adam Heiberg (Oregon State University)
M.S. Student; Advisor: Prof. Karti Mayaram
Project: Low Power Integrated GPS Front End

Julie Hu (University of Washington)
Ph.D. Student; Advisor: Prof. Brian Otis
Project: Low Power BAW-Based PLLs/Tunable LNAs

Dirk Robinson (Washington State University)
Ph.D. Student; Advisor: Prof. George La Rue
Project: High-Speed, Low-Noise ADC in SiGe

James Vandersand (University of Tennessee)
Ph.D. Student; Advisor: Prof. Ben Blalock
Project: An Analog Multiphase Self-Calibrating DLL

Skyler Weaver (Oregon State University)
Ph.D. Student; Advisor: Prof. Un-Ku Moon
Project: Stochastic-Passive AD Techniques Submicron CMOS

Brian Young (Oregon State University)
Ph.D. Student; Advisor: Prof. Pavan Hanumolu
Project: Digitally Enhanced High-Speed Links

AFRL Undergraduate Fellowships: 14

Jeremy Asmussen (Washington State University)
B.S. Student; Advisor: Prof. Deuk Heo
Project: Offset Bias Voltage Bulk Downconversion Mixer RF

Michael Barr (Washington State University)
B.S. Student; Advisor: Prof. Deuk Heo
Project: PFD and CP Design for PLL

Bill Biederman (University of Washington)
B.S. Student; Advisor: Prof. Brian Otis
Project: A Self-Regulating, Gm-Boosted, Complementary Class-C CMOS LC-VCO

Grayson Dietrich (University of Washington)
B.S. Student; Advisor: Prof. Bruce Darling
Project: Advanced Gate Models for Deep Submicron CMOS

Brian Drost (Oregon State University)
B.S. Student; Advisor: Prof. Hanumolu
Project: On-Chip Sampler

Saeed Ghezawi (University of Tennessee)
B.S. Student; Advisor: Prof. Ben Blalock
Project: A Multi-channel D/A Converter

Jon Grueber (Oregon State University)
B.S. Student; Advisor: Prof. Moon
Project: AFRL Oscilloscope on a Chip

Daniel Hubert (Washington State University)
B.S. Student; Advisor: Prof. La Rue
Project: High-Speed FPGA Test Board

Fedja Karalic (University of Washington)
B.S. Student; Advisor: Prof. Brian Otis
Project: Simulating and Designing On-Chip Inductors

Chee-Sing Lee (Oregon State University)
B.S. Student; Advisor: Prof. Weisshaar
Project: Coupling Suppression in Integrated Circuits Using Dummy Metal Fill

Kevin Omoumi (University of Tennessee)
B.S. Student; Advisor: Prof. Ben Blalock
Project: A Multi-channel D/A Converter

Zack Pannell (University of Tennessee)
B.S. Student; Advisor: Prof. Ben Blalock
Project: A Multi-channel D/A Converter

Richard Przybyla (Oregon State University)
B.S. Student; Advisor: Prof. Hanumolu
Project: DAC Implementation in an On-Chip Calibration

Adrianne Thrash (University of Tennessee)
B.S. Student; Advisor: Prof. Ben Blalock
Project: A Multi-channel D/A Converter

2.3 Student Fellowship Program

Undergraduate Fellowship Program

Research experience is considered to be one of the most effective approaches for
attracting talented undergraduates to a particular discipline and retaining them. Even students
who are pursuing undergraduate degrees in a specific engineering discipline, like electrical
engineering, frequently have little knowledge about careers in sub-disciplines
such as electronics. Involving talented undergraduate students in ongoing microelectronic
research projects that feature high-quality interaction with faculty and other
research mentors, including graduate students, makes it highly probable that these students will
pursue graduate study in a specialization that relates closely to their undergraduate research
experience.

The  major  goals  of  the  CDADIC/AFRL  undergraduate  research  opportunity  included: 

•  Present  students  with  challenging  technical  problems  that  will  help  them  develop 
problem-solving,  critical  thinking,  and  creativity  skills. 

•  Improve the participants’ understanding of microelectronic circuit design and
associated technology by involving them in cutting-edge research projects with
state-of-the-art equipment in collaboration with established researchers.

•  Generate interest among these participants in pursuing graduate education and
provide them with the background necessary to become successful graduate school
applicants.

The program provided stipends for these students, which were used either
during the summer or during the academic year. Awards were given only to students who were
mentored by established CDADIC research faculty. As the intent of this program was to develop the
domestic workforce, eligible students were required to be U.S. citizens. A block of funding was made
available to each participating CDADIC university, which in turn selected participants for this
program at its institution. The final selections were made by the CDADIC
Executive Committee. Participants in this program attended CDADIC’s semi-annual national
meetings to share their experiences and research results with their peers and with industry
representatives. All students were required to present posters at the meetings. In addition, each
student was required to write a short paper on their research activity at the end of their term.

To  further  the  educational  and  real-world  industry  experience  for  undergraduates  in  this 
program,  internships  were  encouraged,  especially  at  the  junior  level.  Center  industry  members 
were  helpful  in  placing  students  in  internship  programs. 

Graduate  Fellowship  Program 

This  program  was  directed  at  meeting  the  challenge  of  recruiting  and  retaining  talented 
U.S.  citizens  as  graduate  students  in  the  microelectronic  technology  disciplines.  In  addition  to 
capturing  the  interests  of  students  so  they  might  consider  careers  in  microelectronics,  it  was  also 
the  intent  to  encourage  students  to  continue  their  study  as  graduate  students  in  the  discipline. 

This has become increasingly important as the practice of analog or mixed-signal circuit design
requires formal instruction beyond what students can obtain as undergraduates. One of the major
deterrents that keep U.S. citizen students from pursuing study beyond the baccalaureate is economics.
Upon graduation, an undergraduate student can choose to seek employment or remain in the role of a
student. In general, graduate stipends have not kept pace with the starting salary of a graduate with
a bachelor’s degree. A rule of thumb that has been used over the years for graduate student pay
is that competitive stipends should be approximately one-half of the starting salary of
employees with a baccalaureate degree. This is not an issue for international students, who
receive the same graduate stipend as U.S. students. For these students, the stipend level appears
attractive when compared to the salary they could earn at home. Hence, this
program aimed to provide supplements to standard graduate student stipend levels in the form of
a fellowship available only to U.S. citizens.

Successfully recruiting and retaining U.S. citizen students as graduate students in the
microelectronic circuit design discipline required that funding be set aside as independent
fellowships to be awarded on a competitive basis to qualified students who were actively engaged
in CDADIC-related research. The intent was to establish a standard individual CDADIC/AFRL
fellowship award amount that would attract high-quality students
into the program. The requirements and selection process were similar to those for the undergraduate
fellowship awards: a block of funding was made available to each participating CDADIC
university, which in turn selected participants for the program at its institution. The
final selections were again made by the CDADIC Executive Committee. Participants in this
program attended CDADIC’s semi-annual national meetings, where they gave presentations on
their research activities and shared research results with others via posters and numerous networking
sessions. All students were required to present posters at the meetings. In addition, each student
was required to write a short paper on their research activity at the end of their term.

2.4  Results  and  Conclusions 

The CDADIC/AFRL Fellowship program was highly successful in achieving its
objective, which was to encourage fellowship recipients to continue graduate study in the
discipline and eventually to pursue careers in the field. Table 9 reports the status of students after
they finished their fellowship program. Five of the 10 students receiving
graduate fellowships continued in the graduate program, with all pursuing a Ph.D. in
microelectronics. The remaining five students graduated and moved on to careers in the
microelectronics field. Of the 14 undergraduate students who received AFRL Fellowships, 7
continued on to graduate school and 7 are now working in industry, in both the commercial
and military/defense sectors. In summary, the CDADIC/AFRL Domestic Workforce
Development Effort was a complete success and met its goal, since all 24 students in the
program either are continuing their education in graduate school or are now working as engineers
in the commercial and defense sectors. The continuation of this program is highly recommended.

Table 9. Current Status of AFRL Student Fellowship Recipients, 2010

AFRL Graduate Fellowships: 10
CURRENT STATUS (2010)

Conrad Donovan (Washington State University)
Current Status: Ph.D. program at WSU

Steven Gaskill (Oregon State University)
Current Status: Ph.D. program at OSU

Mark Hale (University of Tennessee)
Current Status: Cadence Design Systems

Bill Hamon (Washington State University)
Current Status: RVJ Technologies

Adam Heiberg (Oregon State University)
Current Status: Azuray Technologies

Julie Hu (University of Washington)
Current Status: Ph.D. program at UW

Dirk Robinson (Washington State University)
Current Status: Advanced Micro Devices

James Vandersand (University of Tennessee)
Current Status: Cadence Design Systems

Skyler Weaver (Oregon State University)
Current Status: Ph.D. program at OSU

Brian Young (Oregon State University)
Current Status: Ph.D. program at OSU

AFRL Undergraduate Fellowships: 14
CURRENT STATUS (2010)

Jeremy Asmussen (Washington State University)
Current Status: CIA

Michael Barr (Washington State University)
Current Status: Ph.D. program at WSU

Bill Biederman (University of Washington)
Current Status: Ph.D. program at UC Berkeley

Grayson Dietrich (University of Washington)
Current Status: Algas SDI

Brian Drost (Oregon State University)
Current Status: Ph.D. program at OSU

Saeed Ghezawi (University of Tennessee)
Current Status: Schneider Electric

Jon Grueber (Oregon State University)
Current Status: Ph.D. program at OSU

Daniel Hubert (Washington State University)
Current Status: Naval Sea Systems Command

Fedja Karalic (University of Washington)
Current Status: Engineer in industry

Chee-Sing Lee (Oregon State University)
Current Status: Ph.D. program at OSU

Kevin Omoumi (University of Tennessee)
Current Status: Ph.D. program at UT

Zack Pannell (University of Tennessee)
Current Status: NASA’s Jet Propulsion Lab

Richard Przybyla (Oregon State University)
Current Status: Ph.D. program at UC Berkeley

Adrianne Thrash (University of Tennessee)
Current Status: Naval Air Systems Command

Power Management System for Environmental Energy Harvesting

Conrad  Donovan 


Wireless sensors are attractive in a wide range of applications, such as environmental
monitoring, oceanographic study, and military tactical surveillance, for real-time data acquisition from
remote locations. In the last few decades, wireless networks and related electronics have advanced
significantly. Despite these advances in electronic components, the lifetime of most electronic
devices is limited by the power source. For electronic devices in which connection to the power grid is
impractical or impossible, the use of dry cell batteries is popular, but the problem with dry cell
batteries is that the battery life is finite. Replacing or recharging the batteries can be costly and
time-consuming, has environmental costs, and can result in downtime.

As an alternative, there are many existing technologies that can harvest energy from the local
environment. Examples include piezoelectric materials for harvesting kinetic
energy, solar panels, and microbial fuel cells. Currently there is a need in the art for systems and
methods for the effective management and storage of this harvested energy.

The  term  “Inductor”  as  used  herein,  refers  to  a  wire  coil/magnet  coupling.  The  value  of  the 
inductor  can  be  modified  based  on  the  needs  of  the  power  management  system.  The  value  of  the  inductor 
has  a  large  impact  on  the  maximum  allowable  current  that  the  PMS  can  source  to  the  connected  power 
consuming  device,  and  also  the  overall  efficiency  of  the  PMS. 

The term “Flyback Diode” as used herein refers to a diode used in boost converters to ensure
that the charge always flows from the Power Management System (PMS) to the Connected Power
Consuming Device and not backwards. The diode size can be varied to handle larger currents flowing
through it and allow high power devices to be operated from the PMS. The term “Connected Power
Consuming Device” as used herein refers to any device that is powered by the Power Management
System.

The term “Charge Pump” as used herein refers to a standard industry charge pump. The model
used in the circuit is chosen mainly based on the voltage available from the environmental energy source
(V1; see Figure A-1). The term “supercapacitor” as used herein refers to a capacitor that has a very high
dielectric constant. The supercapacitor is used to store an amount of energy needed to operate the PMS
and connected power consuming device.


Figure A-1. Block Diagram of Power Management System

The term “Switching Regulator” as used herein refers to a standard CMOS regulator using a built-in
voltage reference (determined by circuit needs) to regulate the output voltage (V2; see Figure A-1) by
means of pulse width modulation (PWM) or pulse frequency modulation (PFM). The Switching
Regulator model needed is determined by the power requirements of the Connected Power Consuming
Device and the maximum output voltage of the Charge Pump.

The  Power  Management  System  consists  essentially  of  a  supercapacitor,  Inductor,  Flyback 
Diode,  Feedback  Diode,  Charge  Pump,  and  a  Switching  Regulator. 

The potential environmental energy source is connected directly to a supercapacitor at the input
port of the PMS. As the energy is transferred from the environmental energy source (V1) and stored in
the supercapacitor, the voltage across the dielectric within the supercapacitor will rise. While this occurs,
the charge pump will begin operation, boosting the potential and storing the charge on a separate
low-value capacitor. Once the voltage reaches the charge pump input voltage specification, the charge
pump will close an internal switch, allowing charge to flow from the charge pump capacitor to the input
port of the switching regulator, effectively jump-starting it. The charge pump has an internal voltage
comparator that is connected to an internal reference voltage. To ensure enough charge is available to
operate the connected power consuming device, the feedback diode delays the jump start of the switching
regulator by dumping charge from the charge pump capacitor to the supercapacitor. The characteristic
forward voltage of the feedback diode will determine the amount of charge stored on the supercapacitor at
the moment that the charge pump jump-starts the switching regulator. Depending on the connected power
consuming device, the supercapacitor value can be increased or decreased and the feedback diode can be
selected to have a lower characteristic forward voltage, ensuring that enough energy is available on the
supercapacitor to power the connected power consuming device. Once the switching regulator is
powered, the charge pump is turned off to save energy.

The  switching  regulator  operates  by  providing  a  pulse  width  modulated  (PWM),  or  pulse 
frequency  modulated  (PFM),  signal  to  an  internal  power  MOSFET.  The  duty  cycle  (PWM),  or  frequency 
(PFM),  of  the  signal  is  modulated  depending  on  the  output  voltage.  This  process  controls  the  change  in 
current  through  the  inductor  which  is  proportional  to  the  amount  of  energy  stored  in  the  inductor  and  the 
DC  voltage  across  its  windings.  The  output  capacitor  stabilizes  the  ripple  voltage. 
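
As a rough illustration of the duty-cycle control described above, the following sketch uses the textbook relations for an ideal boost converter in continuous conduction. That topology and operating mode are assumptions made here for illustration only; the report does not give the regulator's component values or switching frequency, and every function name and number below is hypothetical.

```python
def boost_duty_cycle(v_in, v_out):
    """Ideal boost converter in continuous conduction: V_out = V_in / (1 - D)."""
    if not 0 < v_in <= v_out:
        raise ValueError("an ideal boost stage needs 0 < v_in <= v_out")
    return 1.0 - v_in / v_out


def inductor_ripple_current(v_in, duty, inductance_h, switching_freq_hz):
    """Peak-to-peak inductor current change while the power MOSFET is on.

    During the on-time D / f the inductor sees roughly v_in, and dI/dt = V / L.
    """
    return v_in * duty / (inductance_h * switching_freq_hz)


def inductor_energy(inductance_h, current_a):
    """Energy stored in the inductor, E = 1/2 * L * I^2."""
    return 0.5 * inductance_h * current_a ** 2


# Hypothetical operating point: 1.0 V on the supercapacitor, 3.3 V regulated output.
duty = boost_duty_cycle(1.0, 3.3)
ripple = inductor_ripple_current(1.0, duty, inductance_h=10e-6, switching_freq_hz=1e6)
print(f"duty cycle = {duty:.3f}, ripple = {ripple * 1e3:.1f} mA")
```

In this idealized picture, a larger inductance lowers the current ripple for a given duty cycle, which is one way to read the earlier remark that the inductor value affects the maximum current and the efficiency of the PMS.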

Depending on the quality of the environmental energy source and the value of the supercapacitor, the
cycle time for operating the circuitry will vary. Since the supercapacitor voltage drops steadily
during the active stage, eventually the power available will not be enough for the switching regulator to
maintain the output voltage (V2) requirement. During this time the voltage at the output port (V2) will
decrease and eventually turn off the switching regulator. If a large portion of the energy stored on the
supercapacitor was used, the charge pump will also be turned off. This will restart the charging cycle,
and the Power Management System will wait until enough energy is available in the supercapacitor to
jump-start the switching regulator and repower the Connected Power Consuming Device.
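
The charge and discharge cycle described above can be estimated with a simple energy budget. The sketch below assumes an ideal capacitor (E = 1/2 C V^2), constant harvested power, constant load power, a fixed regulator efficiency, no harvesting during the active stage, and no load during recharge. The threshold voltages and all numeric values are hypothetical, since the report does not state them.

```python
def usable_energy_j(capacitance_f, v_start, v_stop):
    """Energy released as an ideal capacitor discharges from v_start to v_stop."""
    return 0.5 * capacitance_f * (v_start ** 2 - v_stop ** 2)


def active_time_s(capacitance_f, v_start, v_stop, load_power_w, efficiency):
    """How long the load can run before the supercapacitor reaches v_stop."""
    return usable_energy_j(capacitance_f, v_start, v_stop) * efficiency / load_power_w


def recharge_time_s(capacitance_f, v_start, v_stop, harvest_power_w):
    """How long the harvester needs to bring the supercapacitor back to v_start."""
    return usable_energy_j(capacitance_f, v_start, v_stop) / harvest_power_w


# Hypothetical values: 1 F supercapacitor cycling between 2.0 V and 1.0 V,
# 10 mW load, 80 % regulator efficiency, 1 mW harvested power.
run = active_time_s(1.0, 2.0, 1.0, load_power_w=10e-3, efficiency=0.8)
wait = recharge_time_s(1.0, 2.0, 1.0, harvest_power_w=1e-3)
print(f"active for about {run:.0f} s, then recharging for about {wait:.0f} s")
```

With these made-up numbers the device would run for roughly two minutes and then wait about 25 minutes, which shows how the cycle time depends on both the source quality and the supercapacitor value.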

Replacing or recharging batteries can be costly and time-consuming, has environmental costs, and
can result in downtime. As an alternative, there are many existing technologies that can harvest energy
from the local environment. The power management system described here can be used to manage
energy harvested from the environment and operate an electronic device.

Electrical Models of Dummy Metal Fill

Steven  Gaskill 


Abstract 

First  of  all  I  would  like  to  thank  the  AFRL  for  the  opportunity  of  the  AFRL  research  fellowship. 
The  fellowship  allowed  me  to  spend  time  on  research  activities.  My  research  during  the  fellowship  has 
focused  on  the  use  and  modeling  of  metal  fills.  Our  main  goal  has  been  to  help  the  design  and 
performance  of  ICs  fabricated  in  advanced  processes  that  require  metal  fill.  We  have  devised  and  studied 
design  curves  for  metal  fill  in  the  plane  of  signal  traces.  This  includes  grounding  patterns,  buffer 
distances,  metal  fill  shapes  and  finite  ground  impedance  effects.  We  have  also  looked  at  off-plane  design 
curves.  More  recently  we  have  looked  at  high  frequency  inductive  effects.  We  have  also  laid  out  a  test 
chip  which  will  test  the  actual  performance  of  some  of  our  proposed  design  strategies.  Recently,  we  have 
developed  a  unique  method  for  calculating  the  effect  of  metal  fill  using  an  effective  polarization  field  and 
shape  factors  resulting  in  a  closed-form  formula.  This  method  has  been  tested  and  used  to  improve  the 
performance  of  MIM  capacitors.  Our  work  has  resulted  in  two  publications  and  we  currently  have  one 
more  pending  approval  [1,2]. 

Research 

Design Strategies for In-Plane Metal Fill: Metal fill is present for manufacturing
purposes such as planarization and uniformity. Most often metal fill is treated as a parasitic to be
minimized. We, however, looked at how to use metal fill effectively as a shielding structure to
improve performance. We found that grounding only a portion of the metal fill was best, since
grounding more yields diminishing returns. The next question we answered was which metal fill
should be grounded. Here we found that the highest isolation for a given loading was obtained when
metal fill was first grounded away from the signal traces and then progressively closer to the traces.
We also looked at shapes of metal fill and found that combinations of metal fill lines and squares
were the best selection. Furthermore, we investigated how to select the buffer distance, which
is the separation between active traces/devices and metal fill patterning. It was
found that, electrically, the largest buffer distance is best in terms of providing isolation without
loading.
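
The trade-off between isolation and loading can be seen in a lumped three-conductor model, offered here as an illustrative sketch and not as the authors' method. Two signal traces couple directly (c12) and through one fill shape (c1f, c2f), and the fill also has capacitance to ground (cfg). If the fill floats, eliminating its node leaves an extra trace-to-trace coupling term; if it is tied to an ideal ground, that term vanishes and the fill capacitance instead loads each trace. All names and values are hypothetical.

```python
def coupling_with_floating_fill(c12, c1f, c2f, cfg):
    """Effective trace-to-trace capacitance when the fill node is left floating.

    The floating node carries no net charge, so its voltage is a capacitive
    divider of the trace voltages; eliminating it adds c1f*c2f/(c1f+c2f+cfg).
    """
    return c12 + c1f * c2f / (c1f + c2f + cfg)


def coupling_with_grounded_fill(c12):
    """With the fill held at an ideal ground, only the direct coupling remains."""
    return c12


def load_with_grounded_fill(c1g, c1f):
    """Trace-to-ground loading grows by c1f once the fill is grounded."""
    return c1g + c1f


# Hypothetical capacitances in fF.
print(coupling_with_floating_fill(1.0, 2.0, 2.0, 4.0))  # coupling, fill floating
print(coupling_with_grounded_fill(1.0))                 # coupling, fill grounded
print(load_with_grounded_fill(3.0, 2.0))                # loading, fill grounded
```

In this toy model, grounding the fill lowers the coupling but raises the loading on the trace, which is why the choice of which fill to ground matters.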

Design Strategies for Off-Plane Metal Fill: The strategies for off-plane metal fill showed
that in some circumstances it did not matter which metal fill was grounded first. We found that in
the cases where it did make a difference, it was again best to ground fill away from the traces before
grounding metal fill closer to active lines. This maximized the isolation while keeping the
loading on the lines low. We are still investigating the shapes of metal fill in the off-plane case, but
preliminary results show that it is very similar to the in-plane case. There is no buffer distance
defined for the off-plane case.

Finite Ground Impedance: We also came up with a matrix method to reduce the
capacitance matrix when the ground connections are not ideal. We looked at inductive and
resistive effects of ground impedance. What we found was that even in the worst-case scenario of
50 Ω, the impedance made virtually no difference up to 30 GHz. This means that even if the traces
are long it is better to ground certain metal fill.

Inductive Effects: There are two inductive effects we have investigated. First, we looked
at what effect distributed inductance on our lines had on our conclusions about shielding using metal
fill. We found that a 1 mm long line can be modeled as a distributed RC line rather than an RLC
line up to ~20 GHz. Second, we looked at the effect of metal fill on inductors. The metal fill adds
loss to inductors, which reduces Q. We found that it may be better to group small-sized metal fill
together to meet minimum density requirements rather than using the default tiling.

Test Chip: To test our conclusions about metal fill used for shielding and its effects on
inductors, we have laid out a test chip. The chip includes 6 MIM capacitors, 12 spirals, and 7
transmission lines, all repeated twice, as well as de-embedding structures.

Closed-Form Equation for Metal Fill Impact: We have recently generated a closed-form
semi-empirical formula to account for metal fill’s capacitive effect on MIM capacitors. This
resulted in a submission to RFIC 2009. The formula was found to be accurate with an error of
less than 1% over a wide manufacturable range of a 0.18 µm process. It also led us to discover
that large square metal fill gave the lowest increase in substrate capacitance of MIM capacitors.

Conclusion  and  Benefits 

This work will hopefully result in many benefits to designers using metal fill. It provides
an increased awareness and understanding of the electrical effects of metal fill in modern ICs. It
provides formulas and tools to account for the impact of metal fill on performance degradation,
specifically increased capacitive effects. The research provides layout approaches that include
metal fill for improved isolation. Overall, it helps improve the design and performance of ICs
fabricated in advanced processes.


Resulting  Publications 

[1] S. Gaskill, V. Shilimkar, and A. Weisshaar, “Noise Suppression in VLSI Circuits Using Dummy
Metal Fill,” Proceedings of the 12th IEEE Workshop on Signal Propagation on Interconnects,
Avignon, France, May 2008.

[2] S. G. Gaskill, V. S. Shilimkar, and A. Weisshaar, “Isolation Enhancement in Integrated Circuits
Using Dummy Metal Fill,” 2008 IEEE Radio Frequency Integrated Circuits Symposium Digest of
Papers, Atlanta, GA, 2008.

A Digital-to-Analog Converter Architecture for Multi-Channel Applications
and  An  Analog  Multiphase  Self-Calibrating  DLL  to  Minimize  the  Effects  of 
Process,  Supply  Voltage,  and  Temperature  Variations 

Mark  Hale  and  James  Vandersand 


Research  Summaries 

Mark  Hale’s  AFRL  Fellowship  research  effort  was  toward  his  dissertation  titled  “A  Digital-to- 
Analog  Converter  Architecture  for  Multi-Channel  Applications.”  This  research  was  prompted  by  a 
variety  of  systems-on-chip  applications  needing  the  capability  of  driving  multiple  analog  voltages.  These 
applications  include  multiple  actuator  control  for  robotics  applications,  automated  test  equipment 
systems,  industrial  automation,  programmable  logic  controllers,  and  satellite  flywheel  motor  control. 

Such  applications  require  a  DAC  for  each  analog  output.  Furthermore,  a  multi-channel  architecture  that 
saves  power  and  area  by  sharing  hardware  is  needed. 

Mark  Hale’s  work  introduced  a  new  single-ramp  multi-channel  12-bit  DAC  architecture.  The 
architecture  includes  a  low  power  Gray  code  counter,  ramp  generator,  digital  comparator,  analog  memory 
units,  and  control  logic.  The  new  multi-channel  DAC  architecture  allows  hardware  sharing  between 
multiple  channels,  and  enables  Systems-on-Chip  to  have  multiple  analog  outputs  for  stimulating 
transducers  or  motors.  Mark  Hale  completed  his  Ph.D.  in  May  2008  and  now  works  for  the  Cadence 
Mixed-Signal  Design  Center  in  Cary,  NC. 
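
The report lists the building blocks of the single-ramp architecture but not how they interact. One plausible reading, sketched below as an assumption and not as Dr. Hale's actual design, is that a shared counter steps a shared ramp, each channel's digital comparator watches for its own code, and on a match the channel's analog memory unit samples and holds the ramp voltage. The function names, the reference voltage, and the use of an ideal linear ramp are all hypothetical.

```python
def to_gray(n):
    """Binary-reflected Gray code of a non-negative integer."""
    return n ^ (n >> 1)


def single_ramp_dac(codes, bits=12, v_ref=1.0):
    """Behavioural model: one shared ramp and counter, one hold cell per channel."""
    steps = 1 << bits
    targets = [to_gray(code) for code in codes]   # per-channel codes, Gray-encoded
    held = [None] * len(codes)                    # analog memory units
    for count in range(steps):
        ramp = v_ref * count / steps              # shared ramp generator
        gray_count = to_gray(count)               # shared Gray code counter
        for channel, target in enumerate(targets):
            if gray_count == target:              # digital comparator match
                held[channel] = ramp              # sample the ramp and hold it
    return held


# Three hypothetical channels: zero, mid-scale, and full-scale codes.
print(single_ramp_dac([0, 2048, 4095]))
```

Under this reading, adding a channel costs only one comparator and one hold cell, since the counter and ramp generator are shared, which matches the hardware-sharing goal stated above.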

James  Vandersand’s  AFRL  Fellowship  research  effort  was  toward  his  dissertation  titled  “An 
Analog  Multiphase  Self-Calibrating  DLL  to  Minimize  the  Effects  of  Process,  Supply  Voltage,  and 
Temperature  Variations.”  Delay  locked  loops  (DLLs)  find  use  in  a  broad  range  of  applications  including 
computing,  time-to-digital  converters  (TDCs),  and  communications.  James’s  research  focused  on  the 
design  and  development  of  a  DLL  circuit  with  minimal  sensitivity  to  variations  in  fabrication  process, 
supply  voltage,  and  temperature  (PVT). 

James Vandersand’s research resulted in a self-calibrating DLL that included an all-digital
calibration circuit as well as a system transient monitor. The coarse calibration helps minimize global
process, voltage, and temperature errors for an analog multiphase DLL. The system monitor is used to
detect any transients that might cause the DLL to unlock, and its output could be used to allow the DLL
to be recalibrated to the new environmental conditions. Measurement results demonstrated that the DLL
could potentially be used in extreme environments such as space.

Dr.  Vandersand  completed  his  Ph.D.  in  May  2008  and,  just  like  Dr.  Hale,  also  works  for  the  Cadence 
Mixed-Signal  Design  Center  in  Cary,  NC. 

Intensive experimental work, including extensive test, measurement, and characterization, was
required to support the aforementioned research in the Integrated Circuits and Systems Laboratory
(ICASL). Thanks to the AFRL Fellowship support, undergraduate students Zack Pannell and Kevin
Omoumi were able to make significant contributions to ICASL’s experimental work through test board
design and implementation, as well as assisting in test and characterization.

Digitally Controlled Synthesizer in 90 nm CMOS

Bill  Hamon  and  Prof.  George  S.  La  Rue 


Abstract 

A  digitally  controlled  synthesizer  (DCS)  using  a  delay  accumulator  and  a  frequency  divider  is 
presented.  The  system  operates  with  an  output  tuning  range  of  2.44  MHz  to  2.5  GHz  using  a  5  GHz 
reference  clock  with  a  power  consumption  of  125  mW.  It  is  designed  in  the  IBM  90  nm  CMOS  process. 
The mostly digital design has no jitter accumulation, high tolerance to device and process variations, and
a small form factor. The novel delay accumulator avoids the need to propagate carries, reducing power
dissipation and area. The design is tolerant to total ionizing dose radiation and single-event upsets. The
design and layout of each component are complete, but the connections between components have
not yet been made. This fellowship supported Bill after funding ran out on the project supported by AFRL through
CDADIC.


Project  Description 

The DCS [1] requires an input reference clock with low jitter and a digital word representing the
period T of the output clock divided by the reference clock period T_clk. The period of the output is
determined by

$$T = 2(N + R)\,T_{clk}$$

where N is a positive integer and R is the fractional remainder, so that the fractional delay R*T_clk is less
than T_clk. The output can be generated by toggling the output after each delay of N reference clock
periods plus R*T_clk. For example, if T_clk is 200 ps (5 GHz) and the desired output period T is 920 ps
(1.087 GHz), then N = 2 and R = 0.3. The output is toggled at times 0, 460 ps, 920 ps, 1380 ps, etc. as
shown in Figure A-2 so that the output period will be 4.6*T_clk.
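The worked example above can be checked with a minimal sketch. The function names are illustrative and not part of the original design; exact rational arithmetic is used only to avoid rounding in the check.

```python
from fractions import Fraction

def split_half_period(t_out_ps, t_clk_ps):
    """Return (N, R) such that T = 2 * (N + R) * T_clk, with 0 <= R < 1."""
    ratio = Fraction(t_out_ps) / (2 * Fraction(t_clk_ps))  # equals N + R
    n = ratio.numerator // ratio.denominator                # whole reference periods
    r = ratio - n                                           # fractional remainder
    return n, r

def toggle_times_ps(t_out_ps, t_clk_ps, count):
    """Times at which the output toggles: one toggle every (N + R) * T_clk."""
    n, r = split_half_period(t_out_ps, t_clk_ps)
    return [k * (n + r) * t_clk_ps for k in range(count)]

n, r = split_half_period(920, 200)
times = toggle_times_ps(920, 200, 4)
print(n, r, [int(t) for t in times])  # 2 3/10 [0, 460, 920, 1380]
```

With T = 920 ps and T_clk = 200 ps this reproduces N = 2, R = 0.3 and the toggle times quoted in the text.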

An accumulator is used to determine the fractional delays. The carry output of the fractional delay
accumulator signals an extra cycle of delay T_clk. A 10-bit value of N allows the output period to range from
2*T_clk to 2048*T_clk, or the output frequency from 2.44 MHz to 2.5 GHz, with T_clk = 200 ps. A 24-bit delay
accumulator provides a 150 Hz frequency resolution and a 7-bit vernier delay line provides picosecond
resolution with T_clk = 200 ps. The output frequency can be changed instantly by changing the input control
word. Figure A-3 shows a block diagram of the DCS.
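As a behavioral sketch of the counter-plus-accumulator idea described above (not the authors' circuit), the following model accumulates a quantized fractional word on every toggle and treats each accumulator carry as one extra reference period. The function names and the way the 150 Hz figure is derived (frequency step at the top of the tuning range for a one-LSB change in R) are assumptions made for illustration.

```python
def dcs_toggle_times_ps(n, r_word, frac_bits, t_clk_ps, count):
    """Toggle times from an integer counter plus a fractional delay accumulator."""
    modulus = 1 << frac_bits
    acc = 0          # fractional accumulator, in units of T_clk / 2**frac_bits
    clk_count = 0    # whole reference periods elapsed at the current toggle
    times = []
    for _ in range(count):
        vernier_ps = acc * t_clk_ps / modulus   # sub-period delay set by the accumulator
        times.append(clk_count * t_clk_ps + vernier_ps)
        acc += r_word
        carry, acc = divmod(acc, modulus)       # carry = one extra reference period
        clk_count += n + carry
    return times

FRAC_BITS = 24
T_CLK_PS = 200.0
r_word = round(0.3 * (1 << FRAC_BITS))          # R = 0.3 quantized to 24 bits
times = dcs_toggle_times_ps(2, r_word, FRAC_BITS, T_CLK_PS, 5)

def f_out_hz(n, r):
    """Output frequency for T = 2 * (N + R) * T_clk."""
    return 1e12 / (2 * (n + r) * T_CLK_PS)

f_min = f_out_hz(1024, 0.0)                             # about 2.44 MHz
f_max = f_out_hz(1, 0.0)                                # 2.5 GHz
f_step_at_max = f_max - f_out_hz(1, 2.0 ** -FRAC_BITS)  # about 149 Hz
vernier_lsb_ps = T_CLK_PS / 2 ** 7                      # about 1.56 ps for 7 bits
```

Under these assumptions the model toggles every 460 ps for the N = 2, R = 0.3 example, and the computed range and step sizes are consistent with the 2.44 MHz to 2.5 GHz range, the roughly 150 Hz resolution, and the picosecond-level vernier step quoted in the text.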

In the DCS, the output of the delay accumulator (DA) controls a vernier delay. If the DA is implemented
with only full adders and no carry propagation, the result consists of two words: the complete sum equals the
sum output plus two times the carry output. If these two words control delay lines in series, the delay will
be the same as if the complete sum was generated. Figure A-4 shows a block diagram of the novel delay
accumulator and delay lines. This approach enables high-speed operation at 5 GHz and reduces the area and
power dissipation. The CMOS accumulator was designed and simulated to operate at 6.25 GHz at 80 mW. The
power dissipation would reduce to about 48 mW without the triple modular redundancy in the upper 8 bits
for single-event upset hardening. Total ionizing dose hardening is achieved using reverse body bias.
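The "sum output plus two times the carry output" property can be illustrated with a small bit-level model. This is a generic carry-save accumulator written for illustration, assuming one full adder per bit with the saved carry fed to the next bit up on the following cycle; the report does not give the actual gate-level design.

```python
def carry_save_step(s, c, x, width):
    """One accumulation step using only full adders (no carry ripples between bits)."""
    mask = (1 << width) - 1
    c_in = (c << 1) & mask                       # each saved carry feeds the next bit up
    new_s = s ^ c_in ^ x                         # full-adder sum outputs
    new_c = (s & c_in) | (s & x) | (c_in & x)    # full-adder carry outputs
    return new_s, new_c

WIDTH = 8
x = 77                                           # hypothetical fractional word
s = c = 0
total = 0
for _ in range(100):
    s, c = carry_save_step(s, c, x, WIDTH)
    total += x
    # The two words together always represent the fully resolved sum.
    assert (s + 2 * c) % (1 << WIDTH) == total % (1 << WIDTH)
```

Because the two words always add up to the resolved sum (modulo the word width), two delay lines in series, one controlled by each word, give the same total delay without waiting for carries to ripple.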

The vernier delays are made from inverters and multiplexers. For fine resolution, the loads on
some inverters are varied by switching in different capacitors. Figure A-5 shows the delay versus a 7-bit
control for 4 different multiplexer selections. Changes in the delay line input take time to settle, and
clock pulses traveling in the delay lines during changes may not receive the proper delays. To overcome this
problem, clock pulses are sent alternately to two different delay lines.

The delay lines need to be calibrated for linearity and to set the maximum delays equal to T_clk.
The DCS is designed so that calibration can proceed without interrupting normal operation. An extra
delay line, making a total of three delay lines, is incorporated on chip so that it can be switched in for the
delay line to be calibrated.

The design and layout of the major components have been completed. Figure A-6 shows the layout
of the delay accumulator, delay lines and counters. Table A-1 gives the target specifications.


Summary

A digitally controlled synthesizer that operates in simulation at 5 GHz has been implemented in
the IBM 90 nm process. A digital counter is used to set the period of the output signal to be an
integer number of reference clocks while the delay accumulator creates an interpolated
value between clock transitions by delaying the output. The system has an on-chip calibrator to
linearize the analog delay lines, which have about 1 ps resolution. The DCS frequency is
digitally controlled and the output can change frequency immediately. The DCS has
a wide tuning range and small area. Because the DCS uses delay lines there is no jitter
accumulation. The jitter is nearly the same as that of the fixed clock reference plus a few picoseconds
of residual non-linearity in the delay lines after calibration.


Figure A-2. Block diagram of digitally-controlled clock synthesizer

Figure A-3. Timing diagram showing the transition of the output signal controlled by the
frequency divider and delay accumulator

Figure A-4. Block diagram of delay accumulator and vernier delays

Figure A-5. Delay of a 5 GHz 4-stage CMOS delay line versus a 7-bit control signal at 4 different
multiplexer values


Figure A-6. Layout of delay accumulator, delay lines and counters

Table A-1. DCS Specifications

Characteristic            Value
Power Consumption         124 mW w/o calibration circuitry
Maximum Frequency         5.5 GHz
Frequency Tuning Range    2.44 MHz to 2.5 GHz
Frequency Resolution      150 Hz with 24-bit Delay Accumulator
Layout Area               0.8 mm x 0.36 mm


Low Power Integrated GPS Front End

Adam  Heiberg 


Research  Summary 

As  a  recipient  of  the  CDADIC/AFRL  fellowship,  I  have  had  the  opportunity  to  work  on  the 
development  of  an  ultra  low  power  GPS  receiver  front  end.  This  front  end  combines  low  power, 
excellent  flexibility,  and  a  high  degree  of  integration,  enabling  a  new  generation  of  compact,  inexpensive 
precision  location  sensors  to  be  developed.  The  motivating  application  for  the  technology  discussed  here 
is  long  duration  remote  location  tracking  of  small  wildlife. 

Since  the  early  1990s,  the  NAVSTAR  GPS  system  has  provided  precision  location  sensing  free 
of  charge  to  anyone  with  a  clear  view  of  the  sky.  Typical  handheld  GPS  receivers  are  bulky  and  power 
hungry,  requiring  large,  expensive  lithium  ion  batteries.  The  newest  generation  of  GPS  receivers  has 
been  utilized  in  GPS  enabled  wrist  watches.  However,  these  applications  still  require  a  lithium  ion 
battery  which  must  be  recharged  after  only  a  few  hours  of  use.  As  a  result,  they  cannot  be  used  in  remote 
sensing  applications  where  long  battery  life  and  low  cost  are  required.  To  realize  high  precision  remote 
location  sensors,  a  new  low  power  GPS  receiver  is  needed. 

The recent rise to prominence of RFCMOS techniques and technology opens new doors in the
design of fully integrated, low power wireless receivers, enabling the realization of ultra low power
integrated receivers which are well suited to cost and power constrained applications. The research
discussed here has focused on the development and implementation of an ultra low power integrated RF
front end for GPS. The fundamental requirements of the design are low power and compact size.

To realize the most compact and economical implementation, the receiver must utilize a high degree of
integration. Figure A-7(a) shows a typical GPS receiver. There are a number of external components present
in this system. These include an external LNA in the active antenna, a matching network, and decoupling
capacitors. In the interest of reduced cost and size, it is vital to either eliminate the external components
from the system or move them inside the receiver IC.

In  the  design  discussed  here,  all  external  components  in  the 
typical  receiver  realization  have  been  moved  on  chip  as  shown 
in  Figure  A-7(b).  This  presents  some  special  design 
requirements.  The  elimination  of  the  external  LNA  that  is 
typically  found  in  the  antenna  module  requires  that  a  high 
performance  LNA  be  placed  in  the  receiver  IC.  In  addition, 
the  external  matching  network  is  eliminated,  and  an 
impedance  match  is  realized  on  chip.  On  chip  decoupling  capacitors  and  fully  differential  circuits  are 
used  to  reduce  supply  noise  sensitivity  and  eliminate  the  need  for  large  external  decoupling  capacitors. 

The  GPS  standard  has  a  unique  set  of  requirements.  Signal  power  in  the  carrier  is  extremely  low, 
relaxing  linearity  requirements  and  calling  for  the  highest  possible  gain.  In  addition,  the  spectrum  around 
the  GPS  center  frequency  is  empty  for  approximately  10  MHz  on  either  side  of  the  carrier.  As  a  result, 
intermodulation  distortion  and  images  are  not  serious  concerns,  being  easily  dealt  with  in  the  analog 
baseband  portion  of  the  system.  In  spite  of  the  low  signal  power  present  at  the  receiver’s  antenna,  noise 
figure  requirements  are  not  stringent  since  the  GPS  signal  is  spread  spectrum  coded,  providing  over  40  dB 
of processing gain and easing the sensitivity requirement for the front end. One additional consideration
is that GPS utilizes quadrature modulation, requiring that the receiver design include quadrature mixers
and oscillators.

Figure A-7. a) Typical GPS receiver, b) Integrated GPS receiver
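The "over 40 dB of processing gain" figure can be sanity-checked with a short calculation. The sketch below assumes the civilian L1 C/A signal (1.023 Mchip/s spreading code, 50 bit/s navigation data); the report does not state which GPS signal the receiver targets, so these two constants are assumptions.

```python
import math

CHIP_RATE_HZ = 1.023e6   # C/A code chipping rate (assumed signal)
DATA_RATE_HZ = 50.0      # navigation message bit rate (assumed signal)

# Processing gain of a direct-sequence spread-spectrum signal:
# ratio of spread bandwidth to data bandwidth, expressed in dB.
processing_gain_db = 10 * math.log10(CHIP_RATE_HZ / DATA_RATE_HZ)
print(round(processing_gain_db, 1))  # 43.1
```

Under these assumptions the result is about 43 dB, consistent with the "over 40 dB" stated in the text.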


RF Front End

Figure A-8 is a diagram of the receiver front end that has been the focus of this work. The design is
fully integrated as discussed previously. The antenna can be connected directly to the chip with no external
parts required for interfacing. An LNA provides input matching, gain, and isolation. A pair of mixers and a
quadrature oscillator provide additional gain and translate the incoming signal to a lower frequency for
baseband signal processing. Each of the blocks in the system has been optimized for ultra low voltage
operation in the interest of reducing power consumption and ensuring compatibility with future low
voltage process technology. The minimum supply voltage for the system is 250 mV.

A two stage LNA is used to achieve the best possible gain. Both stages are neutralized for excellent reverse
isolation. A wideband input match is realized to eliminate the need for external components to tune the
match frequency. The LNA incorporates a variable gain feature to ensure optimal system performance
regardless of the incoming power level. For the best possible low voltage operation, the mixer has been
implemented without stacking devices. In addition, the mixer-oscillator interface has been realized
without an LO buffer, reducing power consumption and improving low voltage operation. The oscillator is
a differential Colpitts design, giving the best possible phase noise performance and providing excellent
rejection of supply noise, eliminating the need for large decoupling capacitors.

Figure A-8. Receiver topology

Table A-2 includes detailed measured results for the RF front end. In addition, a comparison with the best
recently published work is provided. This work exhibits an order of magnitude lower power consumption
than any previously published work, enabling the realization of compact, inexpensive location sensors
[A-1, A-2, A-3].


Figure A-9. Die Photo

Table A-2. Measured results and comparison to prior work

                                     Prior work                                     This work
Parameter                            [1] / [2] / [3]                                VDD = 400 mV / 300 mV / 250 mV
VCO Phase Noise @ 1 MHz [dBc/Hz]     -104 / -118 / -122*                            -113.8 / -112.5 / -112.4
VCO Tuning Range [%]                 -                                              11 / 11 / 11
Nominal Conversion Gain [dB]         36 / 25.8 / 10                                 42.5 / 42.2 / 41.8
Gain Adjustment Range [dB]           -                                              27.5 / 27.2 / 26.8
SSB Noise Figure [dB]                4.8** / 2.7 (DSB) / 6**                        8.6 / 9.2 / 9.6
|S11| @ Center Frequency [dB]        -20, -30 (†)                                   -16 / -16 / -16
10 dB Bandwidth of |S11| [MHz]       210, 250 (†)                                   935 / 935 / 935
IIP3 [dBm]                           -19 / -14.5 / -0                               -35 / -35.2 / -35.8
1 dB Compression Point [dBm]         -27.6, -18 (†)                                 -45.7 / -47 / -48
LO to RF Leakage [dBm]               -55, -105 (†)                                  -80.7 / -81 / -81.75
Process Technology                   0.13 µm CMOS / 0.35 µm BiCMOS / 0.25 µm CMOS   0.13 µm CMOS
Power Consumption [mW]               5.4 / 41.3 / 9.6                               0.586 / 0.405 / 0.352

*  600 kHz offset
** Not specified as DSB or SSB
†  Values listed for only two of the three prior works


Low Power BAW-Based PLLs/Tunable LNAs

Julie  Hu 


Research  Summary 

I sent a CMOS analog FBAR-based PLL to fabrication in spring 2008. I dedicated my entire summer to
the most exciting phase of the project: testing.

Figure A-10 shows the integer-N type II PLL architecture. The design was fabricated in a 0.13 µm CMOS
process. For comparison, an identical LC PLL was also implemented on the same die. Particulars about the
FBAR PLL include:

a.  It has very low phase noise and very low jitter

b.  It has low reference spurs

c.  It runs at extremely low power

d.  It has a limited tuning range

e.  It has a low bandwidth

All of the above derive directly from deploying a high-Q (>2000) FBAR resonator. It can be demonstrated
that the FBAR VCO has a power/phase-noise figure-of-merit over 30 dB better than LC oscillators. As a
result, the FBAR PLL was expected to have a 30 dB or better phase noise improvement at high offset
frequencies over the LC counterpart under the same level of power consumption.

Figure A-10. 4th order type II FBAR-based PLL architecture. The FBAR and LC implementations are identical
except for the VCO tank

The center frequency of the PLL is 1.575 GHz. Such a high-frequency output prohibits using chip packages
because of the requirement on output impedance matching. Therefore, the die was wire-bonded directly on
the PCB (Figure A-11). A zoomed view of how the die was mounted on the board is shown in Figure A-12. The
FBAR resonator was wire-bonded to the VCO side-by-side with the CMOS chip.

Figure A-11. The test bench
Figure A-12. Die photo

The VCO has an on-chip open-drain buffer at the output. The on-board design (Figure A-13) ensures that
the output impedance to the SMA is 50 ohms and the signal is AC coupled to the instrument (SA). The
input to the PLL is the crystal reference, which is less than 50 MHz. AC coupling is provided on-chip.
The on-board design provides a 50 ohm input impedance match (Figure A-14). In addition, the board
provides regulated DC power supplies (Figure A-15). The voltage regulator takes 3.6 V as input and
generates a DC voltage between 1.0 V and 1.5 V, determined by the potentiometer. A variable DC voltage
facilitates exploration of various operating conditions of the chip.

Figure A-13. The VCO output
Figure A-14. The reference input
Figure A-15. DC supply regulation
Figure A-16. Serial data input



Built on the chip is a serial data interface for programming the internal shift registers. An Aardvark
I2C/SPI host adapter from Total Phase was used for this purpose. The device is driven from a laptop computer
running a customized program in Python which generates a clock signal together with a data signal. Since the
digital output level from the device is 3.3 V, the on-board circuit provides a level shift to the chip Vdd
level (Figure A-16). The reset signal originates from a push button which triggers a momentary
normally-off SPST switch.

The chip operates at a supply voltage of 1.0 V with a total power dissipation of 750 µW (500 µW for the
VCO and 250 µW for the rest of the circuitry, including divider, divider buffer, CP, PFD, and crystal
buffer). The measured settling time of the FBAR PLL is 600 µs with a CP current of 10 µA (Figure A-17).
The measured tuning range of the PLL is 1.3 MHz.

Figure A-18 shows the phase noise of the locked and free-running FBAR VCO measured with an Agilent
E5052B signal source analyzer. An external 45 MHz crystal oscillator is used as the reference with a divide
ratio N = 35 (f_RF = 1575 MHz). The measured crystal reference phase noise, scaled by 20*log(N) dB, is
plotted to illustrate the reference noise contribution.
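The divide ratio and the 20*log(N) scaling mentioned above can be worked through numerically. In the sketch below, the helper function name and the -150 dBc/Hz reference noise used in the usage line are hypothetical, chosen only to show how the scaling is applied.

```python
import math

F_OUT_MHZ = 1575.0
F_REF_MHZ = 45.0

divide_ratio = F_OUT_MHZ / F_REF_MHZ           # N = 35
scaling_db = 20 * math.log10(divide_ratio)     # about 30.9 dB

def reference_noise_at_output(ref_noise_dbc_hz, n):
    """Reference phase noise referred to the PLL output (inside the loop bandwidth)."""
    return ref_noise_dbc_hz + 20 * math.log10(n)

# Hypothetical example: a -150 dBc/Hz reference appears near -119.1 dBc/Hz at the output.
example = reference_noise_at_output(-150.0, divide_ratio)
```

With N = 35 the reference noise is raised by roughly 31 dB when referred to the 1575 MHz output, which is why the scaled reference curve is a useful baseline for the in-band noise.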


Reference spur rejection is enhanced by the low loop bandwidth and careful charge pump design. The
measured reference spur level of the FBAR PLL is -77 dBc. The integrated RMS jitter from 10 kHz to
10 MHz is 0.6 ps for the FBAR PLL and 20 ps for the LC PLL.

Figure A-17. Measured settling time
Figure A-18. Measured reference spurs


Summary 

In summary, summer 2008 was a fruitful season. I designed the test bench and successfully
performed the measurements and verification of the FBAR PLL. The measured PLL performance is
consistent with theoretical predictions. Thus, it can be concluded that the first-ever FBAR-based PLL is
working, with 30 dB lower phase noise than the LC PLL at the same power dissipation.


High-Speed, Low-Noise ADC in SiGe

Dirk  Robinson  and  Dr.  George  S.  La  Rue 


Introduction 

Analog-to-Digital Converters (ADCs) with high speed and high accuracy play a critical role in
electronics testing, scientific instrumentation, and digital communication systems. The performance of
ADCs in this performance corner is limited by the input Track-and-Hold Amplifier (THA) that is used.
Consequently, this input THA must have excellent thermal noise and jitter performance.

Past THA designs were optimized for low signal distortion at the expense of voltage headroom,
power consumption, thermal noise, and bandwidth. Current technology scaling makes it desirable to
allow increased signal distortion, which will be compensated for in the digital domain. With the lower
voltage headroom of modern processes, digital distortion compensation will become essential
in order to achieve ever higher sampling rates at high accuracy.

The  purpose  of  this  research  was  to  design  and  fabricate  an  ADC  using  a  THA  based  on  this 
principle.  Digital  distortion  compensation  methods  unique  to  the  architecture  of  this  ADC  were  also 
investigated.  The  project  was  supported  by  CDADIC  for  the  first  2  years  and  the  fellowship  supported  the 
measurements  and  development  of  the  distortion  compensation  techniques. 

High-Speed,  Low-Noise  ADC  Architecture 

The ADC uses a four-stage pipelined design with four bits per stage and a one-bit overlap
between stages to allow for error correction, as shown in Figure A-19. We found that the nonlinearities of
all of the ADC components could be grouped into two nonlinear blocks placed in the first stage.
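As an illustration of how a one-bit overlap allows error correction, the sketch below combines four 4-bit stage codes by shifted addition, a common arrangement in pipelined converters. The report does not describe the actual digital combining logic, so the function names and the shift-and-add scheme are assumptions.

```python
def resolution_bits(stages=4, bits_per_stage=4, overlap=1):
    """Total resolution when adjacent stages share `overlap` bits."""
    return stages * bits_per_stage - (stages - 1) * overlap

def combine_stages(codes, bits_per_stage=4, overlap=1):
    """Add stage codes with each later stage shifted down by (bits - overlap) positions."""
    step = bits_per_stage - overlap
    total = 0
    for code in codes:
        total = (total << step) + code
    return total

print(resolution_bits())              # 13
print(combine_stages([8, 0, 0, 0]))   # 4096
print(combine_stages([7, 8, 0, 0]))   # 4096
```

In this model the four stages yield 13 bits in total. The last two lines show the redundancy at work: if the first stage decides one code too low (7 instead of 8), the second stage's extra range (code 8) restores the same final output.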

The first stage THA in our design (Figure A-20) is based on PMOS switches. For increased hold-mode
isolation, we use two transistors in series, and also short the intermediate nodes together in hold mode. To
reduce charge injection, we also use dummy switches driven out of phase from the main switches.

The two main switches and the dummy switches are all signal-following in track mode and constant in
hold mode to reduce distortion. The gate driving circuit which accomplishes this is shown in
Figures A-21 and A-22. This circuit is implemented with bipolar transistors to achieve low sampling time
jitter. A simple maximum value circuit was used as the basis for signal tracking during track mode, shown
in Figure A-22.

Figure A-19. Pipeline ADC Showing Lumped Nonlinearities
Figure A-20 (labels: Main Switch, Dummy Switch, Output Clamps)
Figure A-21. ECL Maximum Circuit
Figure A-22. THA Gate Driving Circuitry

Digital-Domain  Distortion  Compensation 

We investigated several linearity correction schemes in the digital domain. We found that in order
for digital compensation to work over a wide bandwidth, it was necessary to use compensation models
that depended on the slope of the analog input. For this work, we estimated the slope from the ADC
digital output. Without distortion compensation, the output distortion ranged from -47 to -52 dB for
frequencies between 10 MHz and 210 MHz. Compensation improved distortion to the -54 to -57 dB level.
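A minimal sketch of slope-dependent compensation is given below. It is not the model used in this work, which the report does not specify: the two-term error model (a static cubic term plus a term proportional to the slope), the coefficient values in the simulated distortion, and the use of a known sine reference for fitting are all assumptions made for illustration. The slope is estimated from the digital samples themselves, as described above.

```python
import math

def slope_estimate(y):
    """Central-difference slope taken from the digital output record (treated as circular)."""
    n = len(y)
    return [(y[(i + 1) % n] - y[i - 1]) / 2.0 for i in range(n)]

def fit_compensation(y, x_ref):
    """Least-squares fit of error = c1*y^3 + c2*y^2*slope (hypothetical two-term model)."""
    s = slope_estimate(y)
    b1 = [v ** 3 for v in y]
    b2 = [v * v * d for v, d in zip(y, s)]
    err = [yi - xi for yi, xi in zip(y, x_ref)]
    a11 = sum(p * p for p in b1)
    a12 = sum(p * q for p, q in zip(b1, b2))
    a22 = sum(q * q for q in b2)
    r1 = sum(p * e for p, e in zip(b1, err))
    r2 = sum(q * e for q, e in zip(b2, err))
    det = a11 * a22 - a12 * a12          # 2x2 normal equations solved by Cramer's rule
    return (r1 * a22 - r2 * a12) / det, (a11 * r2 - a12 * r1) / det

def compensate(y, c1, c2):
    """Subtract the modeled distortion from each sample."""
    s = slope_estimate(y)
    return [v - c1 * v ** 3 - c2 * v * v * d for v, d in zip(y, s)]

def rms(values):
    return math.sqrt(sum(v * v for v in values) / len(values))

# Simulated record: coherent sine with an assumed static and slope-dependent distortion.
N, CYCLES = 256, 7
w = 2 * math.pi * CYCLES / N
x = [math.sin(w * i) for i in range(N)]
dx = [w * math.cos(w * i) for i in range(N)]
y = [xi + 0.01 * xi ** 3 + 0.05 * xi * xi * di for xi, di in zip(x, dx)]

c1, c2 = fit_compensation(y, x)
y_corr = compensate(y, c1, c2)
before = rms([a - b for a, b in zip(y, x)])
after = rms([a - b for a, b in zip(y_corr, x)])
```

In this toy setup the residual error after compensation is a small fraction of the uncompensated error. A purely static model (the cubic term alone) could not remove the slope-dependent component, which is the reason a slope-aware model is needed over a wide bandwidth.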

Summary 

The Bipolar-Driven MOS Switch THA (BDMS-THA) developed in this research has important benefits for
high-speed ADCs. First, it allows operation with a relatively low power supply voltage. Second, since the
main switch acts like a resistor, the added noise is minimal. Also, the switches act as part of a low-pass
filter to further limit noise. Finally, since the sampling time jitter is limited by bipolar switches, the jitter
performance is much improved over what can be achieved using only MOS devices.

In  the  design  of  an  ADC  using  the  BDMS-THA,  we  did  only  minimal  distortion  compensation 
with  analog  circuitry.  The  only  remaining  distortion  compensation  techniques  in  the  analog-domain  are 
signal-tracking  at  the  MOS  gates  of  the  THA  in  track  mode  and  digital  trim  for  the  4-bit  flash  ADC  in 
each  pipeline  stage.  As  much  compensation  as  possible  was  moved  into  the  digital  domain.  The 
compensation  was  applied  by  computer  processing  after  the  records  were  acquired.  This  allowed  us  to 
fabricate  the  chip  before  developing  the  compensation  algorithms,  and  allowed  us  to  evaluate  the 
performance  of  several  compensation  schemes. 

Using the distortion compensation models we developed, the BDMS-ADC shows a Signal-to-
Noise-And-Distortion performance of 54 to 57 dB over an input bandwidth of 200 MHz, corresponding to
around 9.5 effective bits. This makes the ADC competitive with ADC designs using GaAs technology.
However, since the BDMS-ADC was designed with a SiGe process, it has a much lower fabrication cost.


Stochastic and Passive A/D Techniques for Submicron CMOS

Skyler  Weaver 


Research  Summary 

Flash ADCs typically use some sort of reference ladder to generate the comparator trip points that
correspond to each digital code. A stochastic ADC uses device mismatch to generate these trip points.
Consider a large array of identically drawn comparators, each with a random input-referred offset.
Individual offsets are unknown, as they are random, but the overall offset distribution can be defined by
its probability density function (PDF). If all of these comparators are connected in parallel, i.e. their
inputs are all connected as in Fig. A-23(b), and a linear ramp is applied at the input, a plot of the number
of comparators that evaluate high against the input will follow the cumulative distribution function (CDF),
which is merely the integral of the PDF, as depicted in Fig. A-23. If comparator offset follows a Gaussian
distribution or another distribution with a nearly linear CDF, then the CDF can be used as the transfer
function without calibration.
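The CDF behavior described above can be illustrated with a small Monte Carlo model. This is a sketch under stated assumptions: comparator offsets are drawn from an ideal Gaussian with the 140 mV standard deviation reported later in this section, and the seed, array size, and ramp spacing are arbitrary choices for illustration.

```python
import random

def comparators_high(v_in, offsets):
    """Number of comparators whose input-referred offset lies below the input."""
    return sum(1 for off in offsets if v_in > off)

SIGMA_V = 0.140            # offset standard deviation (value reported for the test chip)
NUM_COMPARATORS = 1024
rng = random.Random(0)     # fixed seed so the run is repeatable
offsets = [rng.gauss(0.0, SIGMA_V) for _ in range(NUM_COMPARATORS)]

# Linear ramp from -3 sigma to +3 sigma in 61 steps.
ramp = [(-3.0 + 6.0 * i / 60) * SIGMA_V for i in range(61)]
counts = [comparators_high(v, offsets) for v in ramp]
```

Plotting `counts` against `ramp` traces an S-shaped curve that approximates a scaled Gaussian CDF: near zero at the low end, about half the comparators high at zero input, and nearly all high at the top of the ramp.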

A test chip was fabricated in Jazz 0.18 µm BiCMOS with a total area of 5.76 mm². Increasing the
number of active comparators yields a measured increase in ENOB calculated from SNDR. This indicates
that linearity continues to increase as a function of the number of comparators; however, enabling more
than 1152 comparators for Gaussian nonlinearity reduction does not yield any additional observed
improvement.

Since area and power scale linearly with the number of comparators, we chose to enable only
1152 comparators to demonstrate the concept and obtain additional measurement results, thereby
reducing the effective active area to 0.43 mm².

Since the digital cell comparators are made up of minimum sized transistors, the standard deviation
(σ) of comparator offset is expected to be quite large. In fact, measurement shows that for our test setup
with a supply voltage of 900 mV, σ = 140 mV. Because the signal range is -σ to +σ, the resulting signal
range is 280 mV. With comparator offsets of such magnitude, it would be difficult to obtain any
resolution with conventional circuit techniques. The active comparators are divided into two groups of
576 comparators each and given fixed differential references of -σ and +σ. A 1 MHz sine input is applied
and ENOB calculated from SNDR is above 4.9b up to 18 MS/s. The abrupt drop in ENOB
observed beyond 18 MS/s is due to the ripple-carry adders not having enough time to resolve, thus causing
gross digital errors. By designing a faster adder tree it should be possible to achieve higher sampling
rates.

The Gaussian nonlinearity reduction can be best seen by this example: With all 1152 comparators
acting as a single parallel group, i.e. their inputs are connected and references are connected, sweeping the
input with a linear ramp reveals a transfer function that resembles a Gaussian CDF. SNDR of 25.1 dB
is achieved with a 1 MHz input and sampling frequency of 8.192 MHz. Using the exact same comparators
under the same conditions, but merely dividing them into two groups with differing references, an 8.5 dB
improvement in SNDR was observed.
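The benefit of splitting the comparators into two groups can be illustrated analytically. The sketch below assumes ideal Gaussian offsets and an infinite number of comparators, so each transfer function is an exact CDF; the endpoint-fit nonlinearity metric and the function names are choices made for illustration, not the measurement method used on the chip. The ENOB helper uses the standard relation ENOB = (SNDR - 1.76 dB) / 6.02 dB.

```python
import math

def phi(z):
    """Standard normal CDF."""
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))

def transfer_single(v):
    """One group, one reference; input v in units of sigma."""
    return phi(v)

def transfer_split(v):
    """Two equal groups with references at -sigma and +sigma."""
    return 0.5 * (phi(v + 1.0) + phi(v - 1.0))

def max_nonlinearity(transfer, lo=-1.0, hi=1.0, points=201):
    """Largest deviation from the endpoint line over [lo, hi], as a fraction of the output span."""
    y_lo, y_hi = transfer(lo), transfer(hi)
    worst = 0.0
    for i in range(points):
        v = lo + (hi - lo) * i / (points - 1)
        line = y_lo + (y_hi - y_lo) * (v - lo) / (hi - lo)
        worst = max(worst, abs(transfer(v) - line))
    return worst / (y_hi - y_lo)

def enob(sndr_db):
    """Effective number of bits from SNDR."""
    return (sndr_db - 1.76) / 6.02

nl_single = max_nonlinearity(transfer_single)   # roughly 3% of span
nl_split = max_nonlinearity(transfer_split)     # roughly 0.4% of span
enob_split = enob(25.1 + 8.5)                   # SNDR after the reported 8.5 dB improvement
```

Over the -σ to +σ input range, the two-group transfer function deviates from a straight line several times less than the single-group CDF in this idealized model. The last line converts the improved SNDR (25.1 dB + 8.5 dB) to about 5.3 effective bits.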

Power consumption for the analog portion is 182 µW. Digital power is scaled to reflect the amount
that is related to the number of active comparators. Digital power consumed by disabled portions of the
chip is not included. Digital power is then found to be 449 µW, with 188 µW consumed by clock drivers,
leaving 261 µW consumed by the pipelined ripple-carry adder tree.

Test Chip Performance Summary (fs = 8 MHz, fin = 1 MHz)

Characteristic                          Value
Technology                              0.18 µm CMOS
Resolution                              6b
Max Sampling Rate                       18 MS/s
Supply Voltage                          900 mV
Comparator Offset Standard Deviation    140 mV
Input Range                             280 mVpp (differential)
SNDR / SFDR                             33.59 dB / 42.86 dB
DNL                                     -0.38 / +0.50 LSB
INL                                     -1.06 / +1.07 LSB
Analog Power                            182 µW
Digital Adder Power                     261 µW
Clock Driver Power                      188 µW
Total Power                             631 µW
Core Active Area                        0.43 mm²

Figure A-23. a) Probability density function of comparator offset in terms of standard
deviation, σ, assuming Gaussian distribution, b) 1024 comparators connected in parallel with
single, fixed reference and a ramp input, c) Output of 1024 comparators with ramp
input in terms of σ


Digitally Enhanced High-Speed Links

Brian  Young 


Research  Summary 

My research has focused on two areas: Digitally-Enhanced Multiplying Delay-Locked
Loops (MDLL), and Delta-Sigma Modulator based Time-to-Digital Converters (TDC). The attractiveness
of MDLLs when compared with multiplying phase-locked loops (MPLL)
has increased in recent years due to their low phase noise potential combined with their frequency synthesis
ability. Considering the case of ring-oscillator-based voltage-controlled oscillators (VCO), the MDLL
resets jitter accumulation in the VCO by breaking the VCO loop and replacing the VCO edge with a
coincident reference edge. If the mean output phase is the same as the reference edge phase, the output
phase noise is reset to zero and no deterministic jitter is introduced. This can be modeled as an increase in
loop bandwidth, and reduces the VCO phase noise contribution. In practice, the mean output phase and
reference edge phase are not perfectly aligned due to phase detector and charge-pump offsets, among
other causes. This phase misalignment leads to one output cycle period being different from the other N - 1
cycle periods, where N is the integer feedback divide value, which leads to large deterministic jitter.
MPLL and MDLL circuits designed in modern deep submicron processes commonly use MOS gate
capacitance as loop filter capacitors because of its superior capacitance per unit area; however, the gate
oxide of these devices cannot reliably be considered to have infinite DC impedance and thus passes small
leakage currents. These currents can cause increased control voltage ripple and higher output jitter.
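The deterministic jitter mechanism can be made concrete with a simple timing model. It assumes the loop settles with a static phase offset between the VCO edge and the reference edge that replaces it; the function name and the example numbers are hypothetical.

```python
import math

def mdll_periods(t_ref, n, phase_offset):
    """Output periods over one reference cycle of an MDLL with a static phase offset.

    The VCO free-runs for n - 1 cycles; the n-th edge is replaced by the reference
    edge, so the last period absorbs the whole offset.
    """
    t_vco = (t_ref - phase_offset) / n
    return [t_vco] * (n - 1) + [t_vco + phase_offset]

# Hypothetical numbers: 1000 ps reference period, N = 10, 5 ps static offset.
periods = mdll_periods(1000.0, 10, 5.0)
peak_to_peak = max(periods) - min(periods)
print(periods[0], periods[-1], peak_to_peak)  # 99.5 104.5 5.0
```

One period per reference cycle differs from the other N - 1 by the full offset, which shows up as a periodic (deterministic) jitter component at the reference rate. With zero offset all N periods are equal.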

To  combat  these  problems,  we  have  been  investigating  digital  enhancements  to  MDLL  and 
MPLL  analog  circuits  such  as  a  TDC  to  replace  traditional  phase  detectors  and  phase/frequency  detectors, 
and  digital  loop  filter  architectures  to  replace  the  charge-pump  and  analog  loop  filter.  The  digital  loop 
filter  removes  the  static  phase  offset  and  control  voltage  ripple  issues  of  charge-pump  analog  loop  filter 
MDLLs  and  MPLLs.  Since  the  digital  loop  filter  adds  quantization  noise,  the  designer  must  carefully 
consider  the  tradeoffs  of  increased  digital  resolution  in  succeeding  DACs. 

The  TDC  facilitates  the  use  of  digital  loop  filters  in  digitally-enhanced  MDLLs  and  MPLLs  by 
performing  the  time  (or  phase)  to  digital  conversion  and  providing  a  digitally  coded  signal.  The  simplest 
TDC  is  the  bang-bang  TDC,  which  is  simply  a  D  flip-flop  that  samples  the  reference  edge  with  the 
feedback  edge.  It  can  only  determine  the  sign  of  the  phase  error  (positive  or  negative)  and  not  the 
magnitude  of  the  phase  error.  Other  more  complex  TDCs  function  similar  to  flash  ADCs.  A  flash  ADC 
references  a  series  of  unit  voltages  or  currents  to  determine  the  input  signal  magnitude,  whereas  the  flash 
TDC  references  a  unit  delay  element  (usually  the  propagation  delay  of  an  inverter)  to  determine  the  input 
time/phase  magnitude. 
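
The difference between the two TDC types can be sketched behaviorally. This is only an idealized model: the function names, the 20 ps unit delay, and the 16-stage length are assumptions for illustration and do not describe a specific circuit in this work.

```python
def bang_bang_tdc(dt):
    """Return only the sign of the time error, like a single D flip-flop sampler."""
    return 1 if dt >= 0 else -1


def flash_tdc(dt, unit_delay, n_stages):
    """Count how many unit delays fit into a positive time error.

    The count corresponds to the sum of a thermometer code and saturates
    at the number of delay stages.
    """
    if dt <= 0:
        return 0
    return min(int(dt // unit_delay), n_stages)


sign_only = bang_bang_tdc(95e-12)          # sign of the error, no magnitude
code = flash_tdc(95e-12, 20e-12, 16)       # magnitude in units of 20 ps
```

The bang-bang output carries no magnitude information, whereas the flash output resolves the error to within one unit delay, which is why the unit delay sets the resolution of the flash TDC.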

Considering  the  flash  TDC  to  be  an  ADC  that  processes  signals  in  the  phase  domain,  the 
observation  can  be  made  that  the  high  signal  bandwidths  of  the  flash  TDC  are  excessive  for  use  in  a 
MPLL where loop bandwidths are typically less than one tenth of the reference frequency. In this case,
oversampling  and  noise  shaping  techniques  such  as  delta-sigma  modulation  become  attractive. 
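
To illustrate why oversampling and noise shaping are attractive when the signal bandwidth is small, the following is a minimal sketch of a generic textbook first-order delta-sigma modulator with a 1-bit quantizer. It operates on a normalized input between -1 and +1 and is not the phase-domain architecture described in this report; the function name and the test input are assumptions.

```python
def delta_sigma_first_order(samples):
    """First-order delta-sigma modulator with a 1-bit (+1/-1) quantizer."""
    integrator = 0.0
    feedback = 0.0
    bits = []
    for x in samples:
        # Integrate the difference between the input and the previous output.
        integrator += x - feedback
        # 1-bit quantizer: only the sign of the integrator is kept.
        feedback = 1.0 if integrator >= 0 else -1.0
        bits.append(feedback)
    return bits


bits = delta_sigma_first_order([0.25] * 4000)
average = sum(bits) / len(bits)  # low-pass view of the coarse bit stream
```

Although each output sample is only a sign, the average of the bit stream tracks the slowly varying input, because the quantization error is pushed to high frequencies where a later digital filter can remove it.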

When choosing to process signals in the phase domain, the choice between discrete-time and
continuous-time  modulators  is  not  so  clear.  The  straightforward  method  is  to  use  a  phase-detecting 
transconductor  along  with  a  current  reference  to  provide  modulator  feedback;  however,  this  method 
suffers  from  the  highly  non-linear  operation  of  the  phase  detector  when  both  the  reference  and  feedback 
signals  have  similar  phase.  We  have  developed  a  technique  whereby  a  digital-to-phase  converter  (DPC) 
allows  the  phase-detecting  transconductor  to  operate  within  its  linear  region,  greatly  improving  the 
linearity  and  distortion  performance  of  this  TDC.  This  leads  to  the  delta-sigma  modulator  loop  filter 
having  characteristics  of  both  discrete-time  and  continuous-time  systems. 

I  worked  on  analyzing  digitally-enhanced  MDLL  and  MPLL  circuit  techniques.  The  majority  of 
my  time  and  effort  has  been  spent  theorizing,  modeling,  and  designing  a  time-to-digital  converter  which 
processes signals in the phase domain with improvements in linearity and distortion performance.
Tape-out of this TDC will occur in February with prototype testing to begin Q3 2009.

Offset Bias Voltage Driven Bulk Enabled Downconversion Mixer

for  Radio  Frequencies 

Jeremy  Asmussen 


Abstract 

This paper describes a mixer, designed in 0.18 um CMOS technology and simulated in Cadence,
which uses the bulk terminal of the transistor as one of the inputs. This method allows the
radio-frequency (RF) and local oscillator (LO) stages of traditional switching mixers to be collapsed
into one stage, thereby allowing operation at lower supply voltages and lower power consumption
levels. This also allows for a smaller chip design.

The mixer operates in the 2.4 GHz band and downconverts to a 100 MHz intermediate-frequency
(IF)  signal.  The  mixer  will  consume  1.4  mW  while  achieving  a  voltage  conversion  gain  of  at  least  12  dB. 
The  single  sideband  (SSB)  noise  figure  of  the  mixer  will  be  no  more  than  30  dB.  The  mixer  will  achieve 
an  input-referred  3rd-order  intercept  point  of  6  dBm. 

Results  and  Discussion 

Wireless applications in sensors, medical devices, and telecommunications have emerged over the
past several years, driving a large increase in research on low-power radio-frequency integrated
circuit (RFIC) design. The need for low power is growing because chips must be smaller and are
expected to operate for extended periods of time on a single battery [A-4, A-5]. Active mixers can
provide gain as well as frequency translation, with a smaller noise figure than passive mixers, but
they do not have as high linearity.

Mixers are important to any RFIC. Mixer design is a trade-off among voltage conversion gain,
input-referred 3rd-order intercept point (IIP3; linearity), single sideband (SSB) noise figure,
minimum supply voltage, and power consumption. The mixer discussed in this paper is based on the
Gilbert cell double-balanced mixer, the most common active mixer in use today. The bulk enabled
mixer, which has become more widely used in recent years, lowers the operating voltage
requirements of the mixer [A-6, A-7].

Also, the benefits of using a CMOS process, along with the ease of integrating into a circuit, lend
themselves well to such applications [A-5, A-8]. What follows are the operating principle, design,
post-layout, and simulation results of the bulk enabled mixer at 2.4 GHz, which is designed to meet
all the specifications.

Bulk  Effect 

The mixers used in this paper depend on the bulk effect, which changes the threshold voltage
(Vth). The threshold voltage is a function of the source-bulk voltage (Vsb). The NMOS devices used
follow the relationship of equation 1, where Vth0 is the threshold voltage when Vsb = 0, γ is the
body-effect coefficient, and 2φf is the surface potential. The Vth0, γ, and φf constants were found
in [9]. The bulk effect also leads to the bulk transconductance, gmb = ∂ID/∂VBS, for small signals
and provides control over the threshold voltage, Vth = f(Vsb), for large signals (Figure A-24).


$V_{th} = V_{th0} + \gamma \left( \sqrt{2\phi_f + V_{sb}} - \sqrt{2\phi_f} \right) \quad (1)$

Figure  A-24.  Threshold  Voltage 
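
As a numeric illustration of how the bulk voltage moves the threshold, the sketch below evaluates the standard body-effect expression V_th = V_th0 + γ(√(2φ_f + V_sb) − √(2φ_f)). The value γ = 0.31 V^0.5 is the typical-corner gamma listed in Table A-3; the zero-bias threshold of 0.5 V and the surface potential of 0.7 V are assumed values chosen only for illustration.

```python
import math


def threshold_voltage(v_sb, v_th0, gamma, two_phi_f):
    """Threshold voltage of an NMOS device as a function of source-bulk voltage."""
    return v_th0 + gamma * (math.sqrt(two_phi_f + v_sb) - math.sqrt(two_phi_f))


V_TH0 = 0.5       # assumed zero-bias threshold voltage, in volts
GAMMA = 0.31      # body-effect coefficient, in V^0.5 (typical corner, Table A-3)
TWO_PHI_F = 0.7   # assumed surface potential, in volts

# Sweep the source-bulk voltage the way a bulk-applied signal would.
sweep = [(v_sb, threshold_voltage(v_sb, V_TH0, GAMMA, TWO_PHI_F))
         for v_sb in (-0.3, 0.0, 0.3)]
```

A positive source-bulk voltage raises the threshold and a negative one (a forward-biased bulk) lowers it, which is the mechanism a bulk-driven stage relies on.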


Bulk  Driven  Gilbert  Cell  Mixer 

As indicated above, the conventional Gilbert mixer has 3 levels of stacked transistors, which
decreases the voltage headroom across RL. The topology of the bulk driven Gilbert cell mixer is seen in
Figure A-25. To improve the voltage headroom, the bulk can be used as a terminal of the transistor. This
allows the gain and switching stages of the Gilbert mixer to be combined into one. The bulk driven
mixer takes differential inputs at the RF and LO ports (RF at the bulk) to improve the noise performance.


Figure  A-25.  Bulk  Driven  Gilbert  Cell  Mixer 

In this circuit the LO signal is applied differentially to the bulk terminals of the two sets of
transistors, while the RF signal is applied differentially to the gate terminals. This allows the circuit to
give the highest voltage conversion gain. As the bulk voltage changes, so does the threshold voltage, so
the changing LO signal modulates the threshold voltage of the transistors. If the LO signal swing is
large enough, the transistors turn on and off with the LO, similar to the switching stage of the Gilbert
cell mixer. This essentially multiplies the RF signal by a square wave at the LO frequency [A-10].
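
The statement that switching multiplies the RF signal by a square wave can be checked numerically. The sketch below is an idealized model, not a simulation of the circuit: it assumes a unit-amplitude 2.4 GHz RF tone and an ideal +1/-1 square wave at 2.3 GHz, and measures the 100 MHz component of their product with a single-bin DFT. The names and the record length are assumptions.

```python
import math


def tone_amplitude(signal, cycles):
    """Amplitude of the component that completes `cycles` periods over the record."""
    n = len(signal)
    re = sum(s * math.cos(2 * math.pi * cycles * i / n) for i, s in enumerate(signal))
    im = sum(s * math.sin(2 * math.pi * cycles * i / n) for i, s in enumerate(signal))
    return 2 * math.hypot(re, im) / n


N = 4096          # samples over one 10 ns record (one IF period)
RF_CYCLES = 24    # 2.4 GHz completes 24 cycles in 10 ns
LO_CYCLES = 23    # 2.3 GHz completes 23 cycles in 10 ns

rf = [math.cos(2 * math.pi * RF_CYCLES * i / N) for i in range(N)]
lo_square = [1.0 if math.cos(2 * math.pi * LO_CYCLES * i / N) >= 0 else -1.0
             for i in range(N)]
mixed = [r * s for r, s in zip(rf, lo_square)]

# Component at the difference frequency, 2.4 GHz - 2.3 GHz = 100 MHz.
if_amplitude = tone_amplitude(mixed, RF_CYCLES - LO_CYCLES)
```

For an ideal square wave the difference-frequency component has an amplitude close to 2/π of the RF amplitude, since the fundamental of the square wave (4/π) splits equally between the sum and difference frequencies.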

Using NMOS transistors in this way makes it necessary to isolate the bulk terminal of each
transistor from the others and from all other transistors in the circuit, so that the bulk signals do
not cross over. This can be achieved through triple-well technology (using the deep N-well option),
which can be seen in Figure A-26. We can then remove the gain stage current source to reduce the
supply requirements. The elimination of the current source will increase the sensitivity of the
bias conditions to process, voltage, and temperature (PVT) variation [A-6]. The deep N-well can also
reduce the RF/LO leakage through the substrate.


Figure  A-26.  Deep  N-well  layout  of  NMOS  transistor 


Simulation  Results 

The operating conditions and simulated performance of the bulk driven mixer are summarized in
Table A-3, and the specified values are found in [A-12]. The mixer operates with a supply (Vdd) of
1.8 V. In addition, 1 V is brought into each chip through the port to power the transistors.
The circuit itself consumes 1.4 mW of power.

The voltage conversion gain and noise figure are functions of the LO power at the bulk input
terminal. The LO input power was chosen to give the largest conversion gain and lowest noise figure.
I then repeated these simulations at the slow corner and fast corner to check the robustness of the
design. The results show that even the slow and fast corners meet my specifications.

The primary comparison of a mixer’s performance is a combination of power consumption,
conversion gain, linearity, and noise figure. The figure of merit (FoM) is the value of an expression that
attempts to incorporate all the important parameter values that describe the performance of a circuit. This
value can then easily be used to compare designs [A-12]. The equation is seen in Figure A-27. fRF is the
frequency at the RF terminal, CG is the voltage conversion gain, NF is the noise figure, IIP3 is the
linearity, and Pc is the power consumed.

$FoM = 20\log(f_{RF}) + CG - NF + IIP3 - 10\log(P_c)$

Figure  A-27.  FoM  equation 
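
The figure of merit, FoM = 20 log(f_RF) + CG − NF + IIP3 − 10 log(P_c), can be evaluated with a short helper. The units are not stated alongside the equation, so the choice below (frequency in Hz, power in watts, gain and noise figure in dB, IIP3 in dBm) is an assumption, and the result should not be expected to reproduce the FoM row of Table A-3. The example inputs are the target specifications quoted in the abstract.

```python
import math


def mixer_fom(f_rf_hz, cg_db, nf_db, iip3_dbm, p_c_w):
    """Mixer figure of merit; the units of each argument are assumptions (see text)."""
    return (20 * math.log10(f_rf_hz)
            + cg_db
            - nf_db
            + iip3_dbm
            - 10 * math.log10(p_c_w))


# Target specifications from the abstract: 2.4 GHz, 12 dB, 30 dB, 6 dBm, 1.4 mW.
fom_spec = mixer_fom(2.4e9, 12.0, 30.0, 6.0, 1.4e-3)
```

Because every term is logarithmic, a 1 dB improvement in gain, noise figure, or IIP3 moves the figure of merit by exactly 1, while halving the power consumption adds about 3.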


Table  A-3.  Designed/simulated  values  of  my  mixer  along  with  comparisons 


Columns: Designed (Slow Corner, Typical Corner, Fast Corner); Simulated (Slow Corner, Typical Corner, Fast Corner); Non Bulk; [1]; PI; [3]; [5]; H; PI; [0]

Technology (um): 0.18, 0.18, 0.18, 0.18, 0.18, 0.18, 0.18, 0.9, 0.18, 0.8, 0.35, 0.13, 0.18, 0.5, 0.25
VDD (V): 1, 1, 1, 1, 1, 1, 1, 1.2, 0.8, 1.8, 0.9, 1.2, 1, 1, 1
Power Consumption (mW): 1.4, 1.4, 1.4, 1.4, 1.4, 1.4, 1.4, 1.8, 0.4, 4, 4.7, 4.8, 1, 0.18, 11.5
LO Frequency (GHz): 2.3, 2.3, 2.3, 2.3, 2.3, 2.3, 2.3, 2, 1.65, N/A, N/A, 2.1375, N/A, N/A, N/A
RF Frequency (GHz): 2.4, 2.4, 2.4, 2.4, 2.4, 2.4, 2.4, 2.15, 1.9, 1.9, 0.9, 2.1325, 5.8, 6.9, 1.8
Voltage Conversion Gain (dB): 15.36, 15.57, 15.77, 14.62, 14.93, 15.23, 7.79, 3.2, 3, 0.5, 2, 8, 13, 6, 11
SSB Noise Figure (dB): 19.96, 20.58, 21.09, 25, 24.62, 17.4, 10, 10.2, 13.5, 9, 17, 21, 14.6
Input and Output Impedance Matching (Ohms): 50, 50, 50, 50, 50, 50, 50, N/A, N/A, N/A, N/A, N/A, N/A
LO to RF and LO to IF Isolation (Feedthrough) (dB): -104, -105, N/A, >-80, >-30, N/A, N/A, >80, 29, N/A
1 dB Compression (dBm): -7.9, -7.5, -13.3, -11, -15, -8, -21, -13, -16, N/A
IIP3 (dBm): N/A, N/A, -2.1, N/A, -6, 3.5, -11.5, -4, -2, 12
FoM = 20log(fRF) + CG - NF + IIP3 - 10log(Pc): 206, 191.8, 197.8, 202, 194, 194, 197.27, 217, 217, 200.89
Kn1 (um): 47.5, 50, 52.5, 47.5, 50, 52.5
Irf (mA): 1.4, 1.4, 1.4, 1.4, 1.4, 1.4
gamma (V^0.5): 0.22, 0.31, 0.4, 0.22, 0.31, 0.4
Lambda (1/V): 9, 10.5, 12, 9, 10.5, 12
Width (um): 30, 30, 30, 30, 30, 30

Linearity 

The linearity is characterized by the output-referred 1 dB compression point when the LO is applied
to the bulk, and is usually expressed in dBm. The simulation result can be seen in Figure A-28. One can
obtain a higher compression point by trading off gain and noise figure by interchanging the LO and RF
[A-9, A-13].


Figure A-28. 1 dB Compression Point


Voltage  Conversion  Gain 

The conversion gain is expressed in dB. Voltage conversion gain is used because power gain
lacks a precisely defined characteristic impedance. It is defined as the ratio of the rms voltage of
the IF signal to the rms voltage of the RF signal. The equation used is seen in Figure A-29 and the
simulated data in Figure A-30. RL is the load resistor, RS is the source degeneration resistor, and gm is
the transconductance of the NMOS transistors.


$G_{dB} = 20\log\left( \frac{2}{\pi} \cdot \frac{g_m R_L}{1 + g_m R_S} \right)$


Figure  A-29.  Voltage  Conversion  Gain  Equation 
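
As a quick check of the definition, the following minimal sketch converts a pair of rms voltages into a voltage conversion gain in dB, taking the gain as the IF voltage relative to the RF voltage. The helper name and the example voltages are hypothetical.

```python
import math


def voltage_conversion_gain_db(v_if_rms, v_rf_rms):
    """Voltage conversion gain in dB from the rms IF and RF voltages."""
    return 20 * math.log10(v_if_rms / v_rf_rms)


# A 12 dB gain corresponds to an IF voltage roughly four times the RF voltage.
gain_db = voltage_conversion_gain_db(3.981e-3, 1.0e-3)
```

Because this is a voltage ratio, the factor is 20 rather than 10, and no reference impedance is needed.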


[Plot: Periodic Steady State Response, dB20(V/V)]


Figure  A-30.  Voltage  Conversion  Gain 


Isolation 

Isolation is defined as the ratio of the signal power exiting one port to the power applied to
another port. It is measured in dB, here between the LO and RF ports of the mixer, and represents the
LO signal leaking through. This leakage should be small enough not to corrupt the desired signal of the
system. The equation used is seen in Figure A-31 and the simulated data in Figure A-32.


$\text{LO to RF isolation} = \frac{P_{RF}}{P_{LO}}$


Figure  A-31.  LO  to  RF  Isolation 


[Plot: Periodic XF Response]


Noise Figure

The noise figure of the mixer is inherently poor due to the bulk-driven mixer core. The noise figure
is measured in dB and describes how much noise a circuit adds to an input signal. The noise figure is
dominated by the noise performance of the first stage. The total noise factor is the sum of these
individual contributions. The equation used to find NF is seen in Figure A-33 and the NF of the bulk
driven mixer is seen in Figure A-34.


Ml  27  A 

4  Gm  *  Rs  +  Gl  *  Rl  *  Rs 

Figure  A-33.  Noise  Figure  Equation 

Figure A-34. Noise Figure
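
The remark that the first stage dominates the noise figure can be made concrete with the standard Friis formula for cascaded stages, which the report does not name explicitly. The sketch below assumes each stage is described by a noise figure and an available power gain in dB; the function name and the example stage values are hypothetical.

```python
import math


def cascaded_noise_figure_db(stages):
    """Total noise figure in dB of cascaded stages (Friis formula).

    stages: list of (noise_figure_db, power_gain_db) tuples, input stage first.
    """
    total_factor = 0.0
    gain_so_far = 1.0  # linear power gain ahead of the current stage
    for index, (nf_db, gain_db) in enumerate(stages):
        factor = 10 ** (nf_db / 10)
        # Later stages contribute (F - 1) divided by the gain that precedes them.
        total_factor += factor if index == 0 else (factor - 1) / gain_so_far
        gain_so_far *= 10 ** (gain_db / 10)
    return 10 * math.log10(total_factor)


# A quiet, high-gain first stage masks a noisy second stage.
nf_total = cascaded_noise_figure_db([(3.0, 20.0), (10.0, 0.0)])
```

With 20 dB of gain in front of it, the 10 dB second stage raises the total only slightly above the 3 dB of the first stage, which is the sense in which the first stage dominates.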


Conclusions 

This mixer was designed in a 0.18 um CMOS process and simulated in Cadence. The circuit
topology applies one of the inputs to the bulk terminal of the NMOS transistors. The advantage of
this topology is low voltage and low power operation while still maintaining the voltage conversion gain
and a noise figure similar to those of a Gilbert cell that does not use the bulk.

PFD and CP Design for PLL

Michael  Barr 


Overview  of  AFRL  Fellowship  Experience 

Receiving  the  AFRL  Fellowship  for  summer  2008  has  been  the  most  influential 
experience  of  my  undergraduate  career.  The  fellowship  introduced  me  to  the  concept  of 
graduate  level  research  and  gave  me  the  opportunity  to  work  in  the  Analog  RF  and  Mixed-Signal 
Group (ARMAG) at Washington State University. Participation in this research group has
helped  me  realize  the  significance  and  difficulty  of  IC  simulation,  testing,  and  design. 

I began my work by completing tutorials on the Cadence software package. To further develop
skills in this software environment, I began searching for simple circuits to simulate. At the
time, a now-graduated doctoral student, Parag Upadhyaya, was researching the area of
phase-locked loops, and I quickly became interested in the topic. After a quick overview of the
phase-locked loop structure, I attempted to simulate each sub-block of the system.

Phase-Locked Loop System

The  phase-locked  loop  (PLL)  system  is  incredibly  useful  in  areas  of  wireless  communication. 
The  circuit  can  be  used  to  stabilize  the  phase  and  frequency  of  communications  channels, 
reconstruct  an  input  signal  with  less  noise,  or  even  to  modulate  or  multiply  a  signal. 

Figure A-35. High-level block diagram of phase-locked loop system


The system consists of a phase frequency detector that determines the error between the
reference clock and feedback divider clock, a charge pump that sources or sinks charge as a
function of clock error, and a loop filter that integrates this charge into a voltage that controls
the oscillator, whose output is fed back through a frequency divider to the phase frequency detector.
I spent the majority of my summer experience studying and simulating the phase frequency
detector and charge pump sub-blocks of the phase-locked loop system (Figure A-35).

Phase  Frequency  Detector 

The phase frequency detector (PFD) converts the error between the reference and feedback
clocks into a voltage signal related to their phase and frequency difference. This error determines
the amount of correction current to be added by the charge pump. A typical solution to this problem
is found using a logic structure based on data flip-flops that track the states of the reference and
feedback clocks. I created my own circuit components, composed mostly of NAND gates, to realize
this solution and performed simulations to verify appropriate behavior.
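
The flip-flop based structure described above can be sketched as a small state machine. This is a behavioral model of a conventional tri-state PFD with an instantaneous reset, not the NAND-gate or TSPC circuits that were simulated; the function name and the edge labels are assumptions.

```python
def pfd(events):
    """Behavioral tri-state phase frequency detector.

    events: iterable of "ref" or "fb" rising edges in time order.
    Returns the (up, down) outputs after each edge.
    """
    up = down = False
    states = []
    for edge in events:
        if edge == "ref":
            up = True        # reference flip-flop is set by a reference edge
        elif edge == "fb":
            down = True      # feedback flip-flop is set by a feedback edge
        else:
            raise ValueError(f"unknown edge: {edge!r}")
        if up and down:      # reset path clears both flip-flops
            up = down = False
        states.append((up, down))
    return states


lagging_feedback = pfd(["ref", "fb"])        # UP is asserted until the feedback edge
slow_feedback = pfd(["ref", "ref", "fb"])    # UP stays asserted across extra ref edges
```

When the reference leads, UP is high for a time equal to the phase error, and when the feedback clock is slower than the reference, UP stays asserted across cycles, which is how the structure detects frequency as well as phase error.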

To obtain better simulation results, I began to explore other types of transistor logic. I became
interested in true single-phase clocking (TSPC) techniques and transistor-transistor logic to
increase the speed and reduce the power dissipation of the data flip-flops needed to create the PFD. I
experimented with different topologies of flip-flops and PFDs while noting the differences in
operation and simulation. Some sample simulation data from the phase frequency detector is
shown in Figure A-36.




Figure  A-36.  Sample  simulation  results  from  phase  frequency  detector  created  from  TSPC  logic 


Charge  Pump 

The magnitude of the phase frequency error from the PFD determines the amount of charge
added to or removed from the loop filter by the charge pump. This effectively controls the voltage
used to tune the voltage-controlled oscillator, whose output is fed back for comparison with the
reference. I learned the most from working with this particular sub-block. Knowing very little
about current mismatch problems, I designed my own charge pump that used only one current
source. Through simulations, I quickly experienced the problems of current mismatch and began
to review other designs. Simulating other circuit designs gave more insight into feasible
solutions and helped me find an appropriate design for the application of the PLL. Figure A-37
shows a sample charge pump and the simulated transient and DC response.

Figure A-37. Sample simulation results from charge pump realized with regulated cascode circuit
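
The effect of current mismatch can be illustrated with a simple charge balance. The sketch assumes a charge pump whose UP and DOWN currents both flow for a short reset interval every cycle, so that a locked loop must add extra DOWN time to cancel the surplus UP charge. The function names and the example values (105 uA, 100 uA, 200 ps) are hypothetical and are not taken from the simulated circuits.

```python
def net_charge(i_up, i_dn, t_up, t_dn):
    """Net charge delivered to the loop filter in one reference cycle."""
    return i_up * t_up - i_dn * t_dn


def static_offset(i_up, i_dn, t_reset):
    """Extra DOWN on-time that cancels the mismatch charge injected during t_reset."""
    return t_reset * (i_up - i_dn) / i_dn


I_UP = 105e-6      # assumed sourcing current, 5 % larger than the sinking current
I_DN = 100e-6      # assumed sinking current
T_RESET = 200e-12  # assumed interval during which both switches conduct

offset = static_offset(I_UP, I_DN, T_RESET)
residual = net_charge(I_UP, I_DN, T_RESET, T_RESET + offset)
```

With matched currents the offset is zero; with mismatch the loop settles at a nonzero static phase offset and the control voltage ripples every cycle even though the average charge is balanced.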


Conclusions 

Although  my  simulations  were  not  always  successful,  the  experience  gave  me  a  glimpse  of  the 
complex  process  required  for  circuit  design.  From  literature  review  to  simulation,  I  learned  how 
to  use  the  tools  around  me  to  explore  new  problems  and  develop  unique  solutions.  I  also  learned 
to  value  working  in  an  environment  of  intellectually  talented  individuals.  Their  presence  not 
only  helped  me  overcome  academic  problems,  but  motivated  me  to  pursue  graduate  level 
education. 

“A Self-Regulating, Gm-Boosted, Complementary Class-C CMOS
LC-VCO” and “A Tutorial on Simulating and Designing

On-Chip Inductors”


William  Biederman 

Project 

As the demand for lower power and lower phase noise oscillators increases, the state of the art
has shifted away from the standard LC-VCO towards more complicated structures. The
inherently low quality factor of an on-chip inductor has created a designer’s dilemma in choosing
between phase noise and power consumption. Consequently, the current figure of merit (FOM) record
holders use other types of VCOs (such as BAW or FBAR); however, these devices add cost, a constraint
for inexpensive components like wireless sensor nodes. In order to compete with the rising FOM
standard for VCOs while maintaining full integration, a new LC topology is required. This project takes
a new approach to the classic cross-coupled LC-VCO by using a multi-tap inductor to boost the gm of the
cross-coupled transistors and applying amplitude feedback to assist in the startup of the VCO and save
current in steady state. This paper will discuss in detail the design of the oscillator as well as the
prototype results.

We chose to use a complementary topology for our VCO design. A complementary structure
enables larger output voltage swings because the output common mode is centered about Vdd/2, which
also allows for better absolute phase noise. We thus chose to base our VCO design on the standard
cross-coupled LC topology, with a few simple changes. The top level schematic for the oscillator can
be seen in Figure A-38.


Figure  A-38.  Top  level  schematic  of  the  VCO 


The cross-coupled LC topology can be made into a Class-C oscillator by separating out the common
mode bias of the injection transistors. As a Class-C oscillator, the gate swings are offset toward the
source of the injecting transistors further than would be the case in a standard complementary
cross-coupled oscillator with similar steady state swing amplitude. This decreases the gate stress and,
again, allows for more energy to be stored in the inductor for lower phase noise while maintaining a
desirable device lifetime. Also, the combination of Class-C operation and gm boosting (described next)
allows for narrower current pulse injection: the greater the boosting, the better the injection.

The inductor is tapped at 5 points to make an autotransformer. Not only does this allow all node
parasitics to be resonated out, but current can be injected and signals can be monitored at the optimal
impedances for their individual tasks. By connecting the gates of the cross-coupled injection
transistors to the outside inductor nodes, the highest impedance part of the inductor, we “gm boost”
the oscillator. This allows it to run on a smaller amount of current in steady state. This
also reduces the effect of 1/f noise in two ways: larger transistors can be used, and since the pink noise
has an absolute magnitude, increasing the gate swing reduces its importance. Tapping the drains lower
on the inductor keeps these devices in saturation, reducing their noise and decreasing the Miller
multiplication of the capacitance that is seen across the inductor.

The  last  important  extension  from  a  standard  topology  is  the  added  control  loop.  The  control  loop 
(Fig  A-39)  consists  of  four  peak  detecting  transistors  which  conduct  a  small  amount  of  current  at  the  peak 
of  the  swing  where  noise  in  the  current  removed  makes  the  smallest  contribution  to  the  oscillator  phase 
noise.  This  current  is  integrated  with  a  capacitor  to  obtain  a  control  voltage.  Current  sources  implemented 
as  resistors  establish  the  equilibrium  point  of  the  loop. 

[Figure panels: Theoretical Feedback Loop (oscillator core, Vcontrol); Implemented Feedback Loop]


Figure  A-39.  Schematic  of  the  implemented  feedback  loop 


At startup, the bias points of M1 and M2 (Fig A-38) are drawn down to give the FETs a higher
gm until an oscillation is started. When the swings get large enough, pulses of current from the high side
peak detectors are injected into the integrating capacitor, drawing the control voltage up until the current
injected balances the portion consumed through the resistor that sets the equilibrium point. The
bias control of the NFETs (M3 and M4) in the complementary oscillator is the mirror image of the PFET
(M1 and M2) bias control. There are pathological states when the two systems couple strongly. To
eliminate these, we placed a bootstrap at the center tap of the inductor. This keeps the common mode of
the oscillator near Vdd/2, where the two systems are separable. Simulations confirm that the bootstrap is
inactive once the oscillator has reached steady state. Oscillations in the feedback are eliminated by
choosing different time constants for the two main loops (by choosing peak detecting transistors with
different I-V characteristics) and making sure the time constants are not too slow to respond to an
amplitude change that can extinguish the oscillation.

When our control loop is implemented, it theoretically allows the oscillator to operate at high
efficiency over a wide range of conditions without paying a startup power penalty or requiring
sub-optimal transistor sizing to accommodate PVT variations. As long as VDD is adequately regulated,
it is likely that additional stabilization (via an integrated band-gap or similar reference) is
unnecessary. The VCO and feedback circuitry were fabricated in a 130 nm CMOS process. A chip die
photo is shown in Figure A-40. Chip testing is underway.


Figure  A-40.  Oscillator  die  photo 

Advanced Gate Models for Deep Submicron CMOS

Circuit  Simulation 

Grayson  Dietrich 


Project 

This  Fellowship  involved  an  academic  component  of  3  credits  of  independent  study  for 
Grayson  during  Summer  Quarter,  2008. 

Grayson was given the task of making some characterization measurements on IC
devices to gain familiarity with the testing equipment, the process of characterizing IC devices,
and the specific features of devices built in a submicron CMOS process. He was given DIP
packaged die from past fabrication projects to use in this task. These were 0.25 um MOSIS AMI
CMOS chips which contained a variety of test structures and both n-channel and p-channel
MOSFETs in an array of L and W values. He was first taught to use the older generation
Tektronix 576 curve tracer to validate device operation over normal operating conditions, to
gently edge the devices into Vds breakdown, and to set proper current limits to avoid thermal
damage. He was then taught to use the HP-4145B parameter analyzer to automatically extract a
set of I-V curves and to examine the device operation at much lower current levels, extending
into the subthreshold region.

Grayson  was  then  given  the  task  of  measuring  device  breakdown  and  ESD  tolerance  on 
the  same  set  of  chips.  He  was  taught  to  use  the  ETS-910  ESD  simulator,  first  on  discrete 
devices,  and  then  on  the  CMOS  chips  to  evaluate  the  device  ESD  tolerance  levels  to  HBM  and 
MM  pulses.  He  made  measurements  over  a  variety  of  n-channel  and  p-channel  devices  of 
varying  L  and  W  parameters,  and  he  was  able  to  observe  the  onset  of  I-V  curve  shifts,  thermal 
degradation,  and  finally  catastrophic  device  failure  (ruptured  gate  oxide).  He  tabulated  some  of 
these  results  which  showed  some  trends  in  ESD  breakdown  voltage  versus  L  and  W  for  the 
devices. 

Grayson  was  then  asked  to  perform  some  library  work  on  past  reports  of  ESD  testing  and 
device  characterization  to  relate  his  measurements  to  other  work  in  hopes  of  quantifying  the 
physical  processes  behind  the  breakdown.  He  collected  a  number  of  relevant  papers  on  this 
material,  but  the  summer  quarter  ended  before  he  was  able  to  quantify  his  measurements  in  any 
significant  detail. 

On-Chip Sampler

Brian  Drost 


Abstract 

Last school year, two other Oregon State electrical engineering seniors and I completed the
design, tape-out, and testing of a prototype analog integrated circuit as a senior
project. The project was intended both as a final project for the electrical engineering undergraduate
program and as a chance for all of the group members to learn about the chip design process. Part of the
reason for the team’s success was the AFRL fellowship that one other member of the group and I
received.

The goal of our team's project was to design and implement a non-intrusive sample and hold
circuit. The intended use of this circuit is to sample nodes internal to an analog integrated circuit, which
are not strongly driven, as part of built-in self test for analog circuits. This means the circuitry should
have high input impedance so that it does not load the node being sampled. The sampled signal should
then be amplified so it can be digitized by an analog-to-digital converter.

From the beginning, our advisor suggested down-converting the input frequency. The idea is to
use a sampling frequency near, but not the same as, the input frequency. The output will have a
frequency which is the difference between the input and sampling frequencies. The minimum goal was to
have the sampler work at an input of 10 MHz with a maximum output frequency of 1 MHz, but we
designed the circuit to reach a goal of 100 MHz. The down-conversion technique meant that the entire
circuit would not have to be designed to operate at the higher input frequencies, which would reduce the
design requirements and complexity, making the design process easier for us as inexperienced analog
designers.
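
The down-conversion idea can be illustrated numerically. The following is a minimal sketch, not the circuit or test code from the project: the 9.9 MHz sampling clock and the helper name beat_frequency are assumptions chosen for illustration, while the 10 MHz input is the minimum goal quoted above.

```python
import numpy as np

def beat_frequency(f_in, f_s):
    """Apparent output frequency when a sine at f_in is sampled at f_s."""
    # Fold the input frequency into the first Nyquist zone [0, f_s / 2].
    f = f_in % f_s
    return min(f, f_s - f)

f_in = 10e6   # input frequency (10 MHz minimum goal from the text)
f_s = 9.9e6   # assumed sampling clock, slightly below the input frequency

# 990 samples cover exactly 10 periods of the expected 100 kHz beat.
n = np.arange(990)
samples = np.sin(2 * np.pi * f_in * n / f_s)

# Find the dominant tone in the sampled data (skip the DC bin).
spectrum = np.abs(np.fft.rfft(samples))
peak_bin = int(np.argmax(spectrum[1:])) + 1
f_measured = peak_bin * f_s / len(n)

print(f"predicted beat: {beat_frequency(f_in, f_s) / 1e3:.1f} kHz")
print(f"measured beat:  {f_measured / 1e3:.1f} kHz")
```

With these assumed values the beat lands at 100 kHz, comfortably below the 1 MHz maximum output frequency, so everything after the first sampling switch only has to handle the low-frequency signal.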

Design  Process 

Since this was also the first chip we had designed, it provided an opportunity to learn about the
design process and the software tools that analog designers use. During the first months of the project, we
were learning about design and simulation. This required considerable time researching existing circuits
and learning SPICE in computer labs. At the same time, my group and I were enrolled in analog/mixed
signal classes. It was tricky to be learning the design concepts while we were trying to complete the
design itself, but thanks to our advisor and other members of the faculty, we were able to avoid some of
the analog pitfalls and complete our design.

We settled on a two-stage design. Each stage is a simple track-and-hold circuit consisting of
nothing more than a small capacitor and a MOSFET switch. The first stage needed to operate at the full
100 MHz while having as small an input capacitance as possible. Our goal was to reduce the capacitor size
without allowing noise and charge injection from the MOSFET to seriously impact the quality of the
output signal. The target was an input capacitance between 10 fF and 100 fF.
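
To give a feel for the noise side of this trade-off, the sketch below evaluates the standard kT/C sampled-noise expression, v_rms = sqrt(kT/C), at the two ends of the target capacitance range. This is a textbook estimate added for illustration, not a result from the project; the 300 K temperature and the function name ktc_noise_vrms are assumptions.

```python
import math

K_BOLTZMANN = 1.380649e-23  # Boltzmann constant in J/K

def ktc_noise_vrms(capacitance_f, temperature_k=300.0):
    """RMS thermal noise voltage sampled onto a capacitor (kT/C noise)."""
    return math.sqrt(K_BOLTZMANN * temperature_k / capacitance_f)

# Evaluate both ends of the 10 fF to 100 fF target range (300 K assumed).
for c in (10e-15, 100e-15):
    print(f"{c * 1e15:.0f} fF -> {ktc_noise_vrms(c) * 1e6:.0f} uV rms")
```

Under these assumptions a tenfold reduction in capacitance raises the sampled noise by a factor of about 3.2, which is why the first-stage capacitor cannot be shrunk indefinitely.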

The second stage could be designed for a lower frequency, but a buffer was required to prevent
the second stage from affecting the charge on the first stage's capacitor. With a buffer in place, the second
stage capacitor is sized larger to reduce noise. Both stages also utilized a dummy MOSFET to cancel the
charge injection effects from the MOSFET switch. Although perfectly canceling the charge injection is
impossible, the addition of the dummy transistors created a considerable improvement in the quality of
the output signal in simulation.

We considered two possibilities for the buffer between stages. The first option was an OPAMP.
Unfortunately, we were inexperienced amplifier designers early in the project, and designing a
high-frequency, high-gain amplifier is difficult. We did consider a one-stage design that was nothing more than
a differential pair. While we did use this design in one variation of our circuit, it was not used in our main
design. Instead, we used a pair of source followers, which were simpler to design but did decrease the
input range of our circuit. We did get a chance to try a more complicated OPAMP design for the output
buffer. We settled on a standard two-stage topology connected in a unity gain feedback configuration.

The chip was fabricated through MOSIS and, because it was for educational purposes, we used the
AMI 0.5 micron process. Because this is a larger process, the parasitic capacitances and charge injection
effects are larger, which made the task of hitting our frequency goals more difficult. However, it also
allowed us to avoid some of the complexities involved with deep sub-micron processes.

Laying out our design became the next challenge. We used Cadence's ICFB software, which
required a lot of learning on the fly. A good deal of time was spent configuring ICFB to work with the
files we received for the process and then extracting and verifying our design before tape-out. Our
tape-out deadline was early in the year, since the finished chip needed to be received and tested by the end of
the school year. Once the design was taped out and being fabricated, we designed a test board to which we
could attach our chip. The board included an analog-to-digital converter and a microcontroller to
provide clocks, sample the output, and store the results. The microcontroller could then transmit the data
to a PC over USB, where it could be analyzed in MATLAB.

Testing  and  Improvements 

The test results for our chip were almost entirely positive. The circuitry did function as designed,
although we were not able to include on-chip circuitry for the sampler to test. As a result, we could not
test our design as it is intended to be used. The down-conversion technique worked as expected and we
were able to read signals above 10 MHz, but we did not get an opportunity to test at 100 MHz. The
magnitude of charge injection was larger than predicted by our simulations, so we have to question the
effectiveness of the dummy transistor technique.

As  a  first  chip  design,  the  project  was  successful.  My  group  learned  the  full  process  from  design 
and  simulation  through  layout  to  testing.  Our  design,  while  not  complicated,  did  meet  its  goals.  We  were 
able  to  demonstrate  our  functioning  design  at  the  Oregon  State  engineering  fair  as  our  completed  senior 
project. 

Our design certainly had plenty of room for improvement. A next stage design would include a
multiplexer to select from a range of input signals and an analog-to-digital converter to digitize the output
signal before sending it off-chip. Also, redesigning the circuitry for a smaller process would increase the
performance, since it would reduce the parasitic capacitances and charge injection that limit the maximum
input frequency.

Conclusions 

Before my senior year, I was not sure what field of electrical engineering I wanted to specialize
in. When I joined this project group to design our on-chip sampler, I enrolled in analog/mixed signal
classes so that I could contribute as much as possible to the group and the design. However, through
these classes, this project, and this fellowship, I learned that the analog/mixed signal field is a growing
field, and being knowledgeable in this area can provide many job opportunities even outside the field
itself. I am now seriously considering continuing on to graduate school in the analog/mixed signal
area. I would like to thank the AFRL Fellowship program for assisting me and my group with this project
and helping me to become interested in the analog/mixed signal field.
A Multi-channel D/A Converter

Saeed  Ghezawi,  Kevin  Omoumi,  Zack  Pannell,  Adrianne  Thrash 


Research  Summary 

Undergraduate students Saeed Ghezawi, Kevin Omoumi, Zack Pannell, and Adrianne Thrash
worked together to develop a multi-channel D/A converter. For proof-of-principle purposes, a single
channel of an 8-bit DAC was developed and simulated. The principle of operation is as follows. The DAC
uses a single high quality current reference to drive the gates of multiple parallel current mirrors. These
current mirrors are either sized appropriately or, if they are floating gates, programmed appropriately,
such that they provide a changing bias current to a traditional 9 transistor opamp circuit (see Figure
A-41). Floating gate transistors do not simulate well (due to a DC-isolated node), so for the purposes of
simulation, appropriately sized transistors were used. In a few cases, this can lead to large transistors.
However, floating gate transistors allow for these devices to all approach minimum size. Since they are
programmable, the currents in them can be tuned to the right value. Current floating gate programming
techniques seem to allow for greater than 12-bit accuracies. Since the purpose of this work is to make very
small multi-channel DACs, keeping all transistors small is important, but because this design was not to be
fabricated in these early stages, the simulated sizes of the transistors were not an issue.
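
The sizing issue can be made concrete with a small sketch. It assumes the parallel mirrors are binary weighted, which the summary does not state; the function names and the unit current are likewise hypothetical and are used only to show how a digital code would select a bias current and why sized (rather than programmed) mirrors grow large.

```python
def mirror_ratios(n_bits=8):
    """Assumed binary-weighted mirror ratios, LSB first: 1, 2, 4, ..."""
    return [2 ** k for k in range(n_bits)]

def bias_current(code, i_unit, n_bits=8):
    """Sum the currents of the mirrors whose bit is set in the digital code."""
    if not 0 <= code < 2 ** n_bits:
        raise ValueError("code out of range")
    return sum(
        i_unit * ratio
        for k, ratio in enumerate(mirror_ratios(n_bits))
        if (code >> k) & 1
    )

# With sized transistors the largest mirror is 128 times the unit device,
# which is where the large simulated transistors come from in this sketch.
print(mirror_ratios())
print(bias_current(0b10000001, i_unit=1))  # units of the unit current
```

In a floating-gate version the same ratios would be set by programming rather than by device width, so every mirror could stay near minimum size.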


Figure A-42 shows the simulated output of the opamp circuit for various bias currents generated
by the current mirror array. The speed of this circuit is limited by slew rates in the opamps (only 1 opamp
per output channel).


Figure  A-42.  Voltages  for  various  digital  codes  (currents) 
AFRL Oscilloscope on a Chip Research Summary

Jon  Grueber 


Abstract 

Sample  and  hold  circuits  are  essential  to  the  function  of  many  integrated  circuit  interfaces.  With 
proper  modification  of  a  switched  capacitor  sample  and  hold,  it  is  possible  to  output  a  periodic  frequency 
that  is  the  difference  of  the  input  frequency  and  the  MOS  switching  frequency.  Thanks  to  the  AFRL 
fellowship,  this  phenomenon  was  investigated  and  an  “Oscilloscope  on  a  Chip”  was  fabricated. 

Introduction 

Integrated circuits are quickly starting to dominate the technological advancements in the field of
electronics design. These minute devices can often perform tasks that would have challenged room-sized
computers just 50 years ago. While integrated circuits (ICs) do a great job in efficiently processing digital
or binary data, the interface between this and the analog or continuous world is often a major difficulty.
The workhorse of many of these interfaces is the sample and hold circuit, which samples analog
waveforms and stores them temporarily, allowing them to be processed by a separate analog-to-digital
converter. Often the sample and hold circuit can be a limiting factor in determining the effective
number of bits of data that can be resolved from an analog input waveform when translating to a digital
word. The research I did during my senior year, thanks to the AFRL Fellowship, involved designing a
sample and hold circuit and investigating the usage of these circuits as low data rate periodic
oscilloscopes. Sample devices were fabricated, and the effects of charge injection and clock feedthrough on
these devices were studied.

Research  Summary 

A  sample  and  hold  circuit  is  a  simple  switched  capacitor  CMOS  structure  that  can  sample  an 
input  waveform  and  output  a  sampled  version  of  the  waveform  at  a  sampling  rate  dictated  by  the 
switching  frequency  (as  seen  simplified  in  Figure  A-43). 

For normal sampling operations, the rate of sampling is dictated by the Nyquist theorem, which
states that for non-aliased output, the sampling frequency must be greater than twice the highest
frequency component of the input. If this sampling rate is not met, then the original input cannot be
recovered from the sampled output under all circumstances [A-14].

When the input to a sample and hold circuit is no longer a random transient, but is instead
periodic, an undersampled input signal will be aliased, but in interesting ways. It is clear that when
sampling a periodic signal at exactly the periodic frequency, the output will always be the same value,
thus the output will be at DC (0 Hz). When the sampling frequency is slightly greater than the input
periodic frequency, a sampled output voltage waveform of the same magnitude as the input but at a much
lower frequency will appear.


Figure A-43. Basic Sample and Hold Circuit (signals shown: Clk, Analog Input, Held Output)
It can be shown that this frequency (often called the beat frequency) will be the difference of the
clocked MOS frequency and the input frequency [A-15]. The fact that a discrete and scaled down version
of the input frequency appears at the output means that sample and hold circuits can act as an
oscilloscope, digitizing high frequency signals into lower frequency representations that can then be
analyzed.

The project taken on with the help of the AFRL fellowship was to fabricate some of these sample
and hold oscilloscopes, investigate second order effects, and display the oscilloscope functionality on a
computer output.

To investigate the beat frequency effect of sample and hold circuits, two sample and hold circuits
were designed with measures intended to limit detrimental second order effects. The main effect we were
trying to eliminate was charge injection. Charge injection occurs in all switched capacitor circuits: the
charge held in the channel of a MOS transistor in the on state is pushed out of the device when it turns
off. To prevent this charge from coupling between the two stages of the sample and hold circuit, an
opamp buffer was placed between the stages of the first of our circuits and a source follower was placed
as a buffer between the stages of the second. To reduce the overall output impedance of the circuit,
another opamp buffer was placed at the output.

These circuits were simulated, laid out, and fabricated in a 0.5 µm process using the MOSIS
fabrication program. To complement the chip, a board was designed with supply voltages and a
microcontroller to regulate both the clock and the input signal going into the sample and hold circuits.
This microcontroller also contained options for improving the linearity of the output sampled signal
using a built-in calibration lookup table. Finally, the sampled output was transferred to a graphical user
interface our team had developed to rescale the axis of the output waveform by the difference of
the input and clocking frequencies, thus performing the same operation as an oscilloscope.
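
The axis adjustment can be sketched as follows. This is an illustrative reconstruction of the idea rather than the team's interface code: the 1 MHz input, the 0.99 MHz clock, and the function name rescale_time_axis are assumptions. The clock is placed slightly below the input so that the displayed waveform runs forward in time; with the clock slightly above the input, the same rescaling yields a time-reversed image.

```python
import numpy as np

def rescale_time_axis(t_out, f_in, f_clk):
    """Map sample times of the slow output back onto input-signal time."""
    f_beat = abs(f_in - f_clk)
    # One beat period on the output corresponds to one input period.
    return t_out * f_beat / f_in

f_in = 1.0e6     # assumed periodic input frequency
f_clk = 0.99e6   # assumed sampling clock, slightly below the input

n = np.arange(200)
t_out = n / f_clk                            # times at which samples are taken
samples = np.sin(2 * np.pi * f_in * t_out)   # what the sample and hold captures

t_equiv = rescale_time_axis(t_out, f_in, f_clk)
reconstructed = np.sin(2 * np.pi * f_in * t_equiv)  # input evaluated on new axis

print(np.allclose(samples, reconstructed, atol=1e-9))
print(f"displayed span: {t_equiv[-1] * 1e6:.2f} us of input time")
```

Plotting samples against t_equiv therefore traces the shape of the fast input, which is the oscilloscope-like behavior described above.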


Figure  A-44.  Oscilloscope  on  a  chip  layout 


Using the fabricated chip and interface, an input waveform was successfully plotted and scaled
based on the difference of the clocking and input periodic frequencies. Shown in Figure A-45 is a 1 kHz
sine wave output from the chip that has had its axis scaled by the computer interface program we
designed. This output shows no noticeable distortion thanks to the opamp buffers and other signal
protecting measures.


Figure A-45. Output Waveform based on a 1 kHz input sine wave


Further measurements have shown that the chip is capable of correctly displaying inputs of up to
1 MHz with less than 5 percent magnitude error.

Conclusions 

Thanks to funding from the AFRL fellowship program, a sample and hold circuit with sampling
frequencies below the Nyquist rate was able to down-convert input periodic waveforms to lower frequency
images. This low data rate information was then successfully sent to a computer and plotted to examine
both the functionality of the system and the effect of second order charge injection and clock feedthrough.
With the designed sample and hold circuit, these second order effects were significantly decreased.
High-Speed FPGA Test Board

Daniel  Hubert  and  Prof.  George  S.  La  Rue 


Abstract 

Our group designs and lays out high-speed analog-to-digital converters (ADCs), digital-to-analog
converters (DACs) and wireline transceivers. After the integrated circuits (ICs) are received from
semiconductor fabrication, they need to be tested for functionality and characterized. The data rates
required to test these circuits are increasing well beyond 10 Gbps. This project was to design and lay out a
printed circuit board (PCB) containing an FPGA and a large amount of memory to acquire and/or provide
data with high throughput rates to help test our ICs. The schematic has been completed and layout is in
progress.

Project  Description 

To test ADCs, the board needs to acquire and store a block of data into memory. A computer can then
read the memory and analyze the performance. We recently fabricated a 12-bit ADC at 1 GSps, which
requires an acquisition rate of 12 Gbps. To test DACs, a pattern can be stored in memory and output to
the DAC. Similarly, a 14-bit DAC at 2 GSps requires 28 Gbps of stimulus data. We expect to fabricate
transceivers that operate at up to 40 Gbps in the near future. These can be tested using linear feedback
shift registers on the FPGA.

We selected the Lattice SC FPGA for this project because each I/O can operate up to 2 Gbps. There are 8
serializer-deserializer (serdes) channels on the FPGA in the 900 pin BGA package that can go up to 3.8
Gbps. The Lattice SC also has a DDR2 memory interface that can operate up to about 667 Mbps per pin.
We chose to use a notebook-size 200-pin SODIMM memory module since 4 GB modules are inexpensive
and will be sufficient for almost all applications. With a non-ECC data bus width of 64 bits, the throughput
can be up to 42 Gbps per SODIMM. Lattice Semiconductor makes a development board [A-16] that is
close to what we needed. We followed their schematic but removed unnecessary functions and added
connectors to our test interface. We used Eagle PCB software. Symbols and component pads were made
and then the schematic was entered. Figure A-46 shows the first page of over 10 schematic pages. This
page instantiates the FPGA. The many power supply connections are shown on the left. The memory
controller connections are shown in the upper right. The serdes and other I/O connections are near the
bottom. Figure A-47 shows the PCB layout as completed so far. The components are placed near
their final positions but most of the wiring has not been completed.

Summary 

A PCB is being designed to provide or acquire data at up to 40 Gbps using an FPGA and a laptop computer
memory module. The memory bandwidth is over 40 Gbps with a depth of 32 Gbit. The FPGA will
multiplex the 64-bit wide data bus from the memory to 32 lines at up to 1.3 Gbps or to 8 lines at up to 3.8
Gbps. The schematic for a PCB has been completed and layout is in progress.
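
The throughput figures quoted in this report can be cross-checked with a few lines of arithmetic. The sketch below only restates numbers from the text (667 Mbps per pin, a 64-bit bus, a 4 GB module, and the ADC and DAC examples); the variable and function names are illustrative, and the module capacity is assumed to use binary gigabytes.

```python
PIN_RATE_BPS = 667e6          # DDR2 per-pin data rate quoted in the text
BUS_WIDTH_BITS = 64           # non-ECC SODIMM data bus width
MODULE_BYTES = 4 * 2 ** 30    # 4 GB module, assuming binary gigabytes

def required_rate_bps(resolution_bits, sample_rate_sps):
    """Raw data rate needed to stream every sample of a converter."""
    return resolution_bits * sample_rate_sps

memory_bw = PIN_RATE_BPS * BUS_WIDTH_BITS       # about 42.7 Gbps
depth_bits = MODULE_BYTES * 8                   # 32 Gbit (binary)
adc_rate = required_rate_bps(12, 1e9)           # 12-bit ADC at 1 GSps
dac_rate = required_rate_bps(14, 2e9)           # 14-bit DAC at 2 GSps
line_rate_32 = memory_bw / 32                   # per-line rate on 32 lines
capture_seconds = depth_bits / adc_rate         # record length for the ADC

print(f"memory bandwidth: {memory_bw / 1e9:.1f} Gbps")
print(f"ADC needs {adc_rate / 1e9:.0f} Gbps, DAC needs {dac_rate / 1e9:.0f} Gbps")
print(f"32-line rate: {line_rate_32 / 1e9:.2f} Gbps per line")
print(f"ADC capture length: {capture_seconds:.2f} s")
```

Both converter examples fit within the memory bandwidth, and spreading the full bandwidth over 32 lines gives roughly the 1.3 Gbps per line stated in the summary.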
Figure A-46. One page of the schematic that implements the FPGA connections


Figure  A-47.  Printed  circuit  board  layout 
A Tutorial on Simulating and Designing On-Chip Inductors

The following is a brochure created under the AFRL fellowship award.

It  provides  an  introduction  to  on-chip  electromagnetic  element  simulation. 

Fedja  Karalic 
Coupling Suppression in Integrated Circuits Using Dummy Metal Fill

Chee-Sing  Lee 


Research  Summary 

I  assisted  Professor  Andreas  Weisshaar's  research  team  with  a  project  considering  the  parasitic  effects  of 
metal  fills  in  high  speed  integrated  circuits.  Metal  fills  are  pads  of  metal  that  are  positioned  in  an  IC  for 
structural  purposes.  At  high  frequencies  these  can  exhibit  capacitive  effects  that  may  affect  the 
performance  of  the  circuit. 

The  research  group  used  two  programs  for  simulating  the  metal  fills:  COMSOL  for  two-dimensional 
simulations,  and  Q3D  for  three-dimensional  simulations.  The  simulation  generates  the  capacitance  matrix 
from  a  physical  metal  fill  geometry  to  be  tested.  This  matrix  can  then  be  used  for  simulations  with  SPICE 
to  model  the  parasitic  effects. 

Rather than considering a particular geometry, the research team wished to analyze a general
configuration, in particular a rectangular grid of fill pads. Such a configuration can vary along several
parameters: the number of rows and columns, the spacing between the fills, the height of the fills above
the ground plane, and the dimensions of each fill pad. Drawing these different combinations by hand in
Q3D and COMSOL would be very tedious. My task was to create scripts for the two applications in order
to automatically generate the geometries and simulate the capacitance matrices.
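
The parameter sweep described above can be sketched in plain Python. This is not the script written for the project and it calls no Q3D or COMSOL functions; the names fill_grid and sweep, the coordinate convention, and the example values are all hypothetical. It only shows how the grid parameters expand into pad rectangles and how the combinations can be enumerated.

```python
from itertools import product

def fill_grid(rows, cols, pad_w, pad_h, spacing):
    """Return (x0, y0, x1, y1) corners for each pad of a rectangular grid."""
    pads = []
    for r, c in product(range(rows), range(cols)):
        x0 = c * (pad_w + spacing)
        y0 = r * (pad_h + spacing)
        pads.append((x0, y0, x0 + pad_w, y0 + pad_h))
    return pads

def sweep(rows_list, cols_list, spacings, pad_sizes, heights):
    """Yield one parameter dictionary per geometry combination."""
    for rows, cols, spacing, (w, h), z in product(
        rows_list, cols_list, spacings, pad_sizes, heights
    ):
        yield {
            "rows": rows, "cols": cols, "spacing": spacing,
            "pad_w": w, "pad_h": h, "height": z,
            "pads": fill_grid(rows, cols, w, h, spacing),
        }

# Example sweep with made-up dimensions (arbitrary length units).
cases = list(sweep([2, 3], [2, 3], [0.5], [(1.0, 1.0)], [2.0]))
print(len(cases), "geometries;", len(cases[-1]["pads"]), "pads in the last one")
```

Each dictionary would then be handed to whatever tool-specific script draws the pads and runs the solver.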

The Q3D scripting engine is based on VBScript. With the methods supplied by Q3D, I was able to write a
script that automatically draws the fill geometry and runs the simulation, returning the capacitance matrix.
I also created a Python wrapper script to iterate through a range of geometry parameters, calling the Q3D
script each time. COMSOL uses a MATLAB-like scripting engine, again providing unique methods for
control of the application. Because the COMSOL models are two-dimensional, some parameters were
excluded, such as the height above the ground plane and the thickness of the fill pads. The COMSOL
simulation process runs with one fill designated as the input port, and returns a single column of the
capacitance matrix. Thus, to generate the entire matrix, successive simulations need to be run, changing
the input port in between. This was hindered by a limitation of the COMSOL scripting engine, which
does not provide a means to automatically adjust the input ports. An alternative solution using AutoIt
was also explored. AutoIt is a scripting language for Microsoft Windows that can be used to automatically
"drive" an application by generating mouse clicks and text entries. However, only controls created by
the Windows graphical user interface are visible to AutoIt. The COMSOL window is drawn with Java, so
this attempt was also unsuccessful.
Design of a Digital to Analog Converter for Implementation in an
On-Chip  Calibration  Interface 

Richard  Przybla 


Research  Summary 

Under  the  support  of  the  AFRL  Fellowship,  we  designed  two  CMOS  mixed-signal 
integrated  circuits,  which  validated  a  calibration  interface  designed  by  the  student  and  his  senior 
project  group.  This  work  partially  satisfied  the  requirements  for  the  student’s  honors  college 
thesis  project  as  well  as  for  the  student’s  group’s  senior  project. 

Analog  integrated  systems,  especially  those  that  are  prototypical,  require  calibration  to 
achieve  optimum  performance.  The  simplest  method  to  calibrate  these  analog  systems  is  use  of  a 
pin  for  each  node  that  must  be  calibrated.  However,  this  method  is  undesirable  for  several 
reasons:  it  increases  the  number  of  pins  in  the  package,  which  increases  the  cost  and  the 
parasitics  associated  with  the  packaging;  and  it  increases  the  area  and  therefore  the  cost  of  the 
die.  This  method  also  does  not  provide  a  simple  way  to  automate  the  calibration  process  without 
the  use  of  a  complicated  test  setup  involving  either  another  integrated  circuit  or  several  power 
supplies. 

Digital  systems  have  a  similar  problem:  they  often  use  data  in  a  parallel  manner. 
Supplying  a  parallel  digital  stimulus  to  the  chip  requires  as  many  pins  as  the  digital  bus  is  wide. 
Since  digital  systems  are  often  8,  16,  24,  or  32  bits  wide,  a  parallel  interface  is  too  costly  in  terms 
of  the  number  of  pins  that  it  uses.  A  serial  interface  allows  an  arbitrarily  wide  digital  stimulus  to 
be  applied  to  the  system. 

Our senior project group of three students developed a solution to this problem: an
on-chip serial interface that, when coupled with a digital-to-analog converter (DAC), allows the
application of both analog and digital stimuli to the circuitry under test. The interface is called
the Mixed-Signal Test Interface (MTI). It is ideal for mixed-signal systems that need both analog
and digital outputs, but can also be used for pure digital or pure analog systems.

The interface consists of a serial-peripheral interface (SPI) compatible digital register.
The serial-in, parallel-out characteristic of this register allows 4 pins to be used to control an
arbitrary number of digital outputs to the circuitry under test. Some or all of these digital outputs
can be connected to a digital-to-analog converter (DAC) that was designed by the student that
this fellowship supported.
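
A behavioral model helps show why a handful of pins can drive any number of outputs. The sketch below is an assumption-laden illustration, not the MTI design: the class name, the separate latch step, and the mapping of the four pins to the usual SPI signals (clock, data in, data out, select) are not taken from the report.

```python
class SerialInParallelOut:
    """Behavioral model of a serial-in, parallel-out register with a latch."""

    def __init__(self, width):
        self.width = width
        self._shift = [0] * width    # internal shift chain
        self.outputs = [0] * width   # latched parallel outputs

    def clock_in(self, bit):
        """Shift one bit in on a clock edge; return the bit shifted out."""
        shifted_out = self._shift[-1]
        self._shift = [bit & 1] + self._shift[:-1]
        return shifted_out

    def latch(self):
        """Copy the shift chain to the parallel outputs (e.g. on deselect)."""
        self.outputs = list(self._shift)

    def write(self, bits):
        """Clock in a whole word, then update the outputs once."""
        for bit in bits:
            self.clock_in(bit)
        self.latch()

reg = SerialInParallelOut(10)
word = [1, 0, 1, 1, 0, 0, 0, 0, 0, 1]
reg.write(word)
# The first bit sent ends up at the far end of the chain.
print(reg.outputs)
```

Making the register wider adds outputs without adding pins; only the number of clock cycles per update grows.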

The MTI is designed to be a specification for a test interface that can be implemented by
graduate students on their integrated circuits for calibration purposes. However, to prove the
specification we implemented two iterations of the design in AMIS's C5N 0.5 µm CMOS. Each
implementation included several SPI compatible registers and several DACs.

The architecture for the DAC was chosen to be a current steering topology. The
segmentation was chosen to be 5 MSBs thermometer coded and 5 LSBs binary weighted. Segmentation
allows reduction in the complexity and die area of the decoder required for translating binary inputs to
thermometer code, while still achieving accuracy requirements that binary steering elements
alone cannot realize. Figure A-48 shows the block diagram of the current steering DAC.
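
The 5 + 5 segmentation can be illustrated with a short decoder model. This is a behavioral sketch under the stated split, not the circuit shown in Figure A-48; the function names are hypothetical, and the unit-element weighting (each thermometer element worth 32 LSBs) follows directly from a 5-bit binary section.

```python
def segment_code(code, msb_bits=5, lsb_bits=5):
    """Split a DAC code into thermometer (MSB) and binary (LSB) controls."""
    n_bits = msb_bits + lsb_bits
    if not 0 <= code < 2 ** n_bits:
        raise ValueError("code out of range")
    msb = code >> lsb_bits
    lsb = code & ((1 << lsb_bits) - 1)
    # 2**msb_bits - 1 equal unit sources; the first `msb` of them are on.
    thermometer = [1 if i < msb else 0 for i in range(2 ** msb_bits - 1)]
    # Binary-weighted sources, LSB first.
    binary = [(lsb >> k) & 1 for k in range(lsb_bits)]
    return thermometer, binary

def output_in_lsb_units(thermometer, binary, lsb_bits=5):
    """Sum the switched currents, expressed in units of one LSB."""
    unit_weight = 2 ** lsb_bits
    return sum(thermometer) * unit_weight + sum(b << k for k, b in enumerate(binary))

therm, binary = segment_code(0b1000011010)
print(sum(therm), "thermometer sources on, binary bits (LSB first):", binary)
print(output_in_lsb_units(therm, binary))
```

The decoder only has to expand 5 bits into 31 lines, rather than 10 bits into 1023, which is the area saving the paragraph refers to.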
Figure A-49 shows a die micrograph of one of the integrated systems that the group
designed. It has three SPI-compatible registers, of 10, 100, and 1000 bits respectively, and five
10-bit DACs.

The  support  of  the  AFRL  was  greatly  appreciated  and  facilitated  the  successful 
completion  of  this  project. 


Figure A-48. DAC schematic. The upper right inset shows the biasing circuit and the upper left
inset shows the current source and switch


Figure A-49. Chip Micrograph. The DACs are along the right side of the chip. The center of the
chip is dominated by the 1000 bit SPI register, and the 100 bit and 10 bit SPI registers
are at top left
APPENDIX B

DD  Form  882,  Invention  Reports  (2) 


LIST  OF  ACRONYMS  AND  SYMBOLS 


A/D       Analog to digital
ADC       Analog-to-digital converter
ADS       Analog design system
AFE       Analog front end
AFRL      Air Force Research Laboratory
Ah        Ampere-hour(s)
Amp       Ampere
ANSI      American National Standards Institute
ASIC      Application-specific integrated circuit
BAW       Bulk acoustic wave
BER       Bit error rate
BSIM      Berkeley Short-channel IGFET Model
BWLS      Bandwidth, large signal
BWSS      Bandwidth, small signal
CAD       Computer aided design
CAN       Controller area network
CBR       Constant bit rate
CDADIC    Center for Design of Analog-Digital Integrated Circuits
CDC       Clock distribution circuit
CDD       Clock distribution device (or driver)
CDF       Comparator density function
CDR       Clock and data recovery
CLS       Correlated level shifting
CMF       Current-mode feedback
CMFB      Common-mode feedback
CML       Current-mode logic
CMOS      Complementary Metal-Oxide-Semiconductor
DAC       Digital-to-analog converter
D/A       Digital to analog
dB        Decibel
DCO       Digitally controlled oscillator
DCS       Digital-controlled clock synthesizer
DDI       Digital data input
DIO       Data input/output
DNL       Differential nonlinearity
DVM       Digital voltmeter
DWA       Data-weighted averaging
EAR       Export Administration Regulations
EMI       Electromagnetic interference
ENOB      Effective number of bits
EOT       Equivalent (SiO2) oxide thickness
ESD       Electrostatic discharge
FET       Field effect transistor
FF        Flip flop
FFT       Fast Fourier Transform
FPGA      Field Programmable Gate Array
GBIC      Gigabit interface converter
Gbps      Gigabits per second
GBW       Gain bandwidth
GHz       Gigahertz
HF        High frequency
IC        Integrated circuit
ICA       Integrated circuit accumulator
iCAB      Inverse class-AB detector
INL       Integral nonlinearity
I/O       Input/output
ITAR      International Traffic in Arms Regulations
kHz       Kilohertz
kV        Kilovolt
kVA       Kilovolt-ampere
LC        Inductance capacitance
LCR       Inductance capacitance resistance
LNA       Low noise amplifier
LO        Local Oscillator
LSI       Large scale integration
MASH      Multi-stage-noise-shaping
Mbps      Megabits per second
MDAC      Multiplying digital-to-analog converter
MEMS      Microelectromechanical systems
MESFET    Metal-Semiconductor Field-Effect-Transistor
MHz       Megahertz
MIMO      Multiple input, multiple output
MOSFET    Metal Oxide Semiconductor Field Effect Transistor
MPW       Multiproject wafer
mV        Millivolt
µm        Micrometre
nA        Nanoampere
n         Nano (10^-9)
nm        Nanometer
nMOS      An n-channel metal-oxide semiconductor transistor
NPN       Negative-positive-negative
ns        Nanosecond
nsec      Nanosecond (Millimicrosecond)
nW        Nanowatt
Op amp    Operational amplifier
OTA       Operational transconductance amplifier
pA        Picoampere(s)
PDF       Probability density function
PFD       Phase-frequency detector
PLINCO    Passive linear counter
PLL       Phase locked loop
PM        Phase modulation
pMOS      A p-channel metal-oxide semiconductor transistor
PRC       Parasitic resistance cancellation
pV-s      Picovolt second(s)
PVT       Process, supply voltage, and temperature
RAM       Random access memory
RBB       Reverse body bias
RC        Resistor-capacitor
RF        Radio frequency
RFI       Radio frequency interference
RMS       Root mean square
ROM       Read-only memory
RTC       Real-time clock
RX        Receive
SCF       Switched-capacitor filter
SCR       Silicon-controlled rectifier
SCT       Single chip transceiver
SFDR      Spurious-free dynamic range
SiGe      Silicon germanium process
SNDR      Signal-to-noise and distortion ratio
SNR       Signal-to-noise ratio
SOC       System on a chip
SPECTRE   Spectre is a SPICE-class circuit simulator
SPICE     Simulation Program with Integrated Circuit Emphasis
SQNR      Signal-to-quantization-noise ratio
TDC       Time-to-digital converter
T/H       Track/hold
TDR       Time-delay relay
TMR       Triple modular redundancy
TTL       Transistor-transistor logic
T/R       Transmit/receive
UGB       Unity-gain bandwidth
V-s       Volt-second(s)
V/F       Voltage-to-frequency
VCCS      Voltage controlled current source
VCO       Voltage controlled oscillator
VFO       Variable-frequency oscillator
VGA       Variable-gain amplifier
VLF       Very-low frequency
VLSI      Very large-scale integration
ZCD       Zero-crossing detector